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ABSTRACT 


This  thesis  presents  two  methods  for  modelling  urban 
air  pollution  using  the  time  series  analysis  technique.  This 
technique  is  different  from  the  atmospheric  diffusion 
technique  commonly  applied  to  modelling  air  pollution,  a 
brief  review  of  which  is  given  in  Chapter  I  along  with  other 
works  in  urban  air  pollution.  The  two  types  of  models 
developed  here  are  the  stochastic  and  the  dynamic  system 
models,  which  are  established  to  describe  the  behavior,  in 
Edmonton,  of  a  particular  air  pollutant  called  oxides  of 
nitrogen.  Since  models  from  Time  Series  Analysis  are  based 
on  observed  data,  Chapter  II  is  devoted  to  describing  the 
source,  precision,  and  organization  of  the  data  used  to 
develop  the  models.  In  Chapter  III,  the  stochastic  models 
are  developed,  tested,  and  used  for  forecasting  while  Chapter 
IV  is  devoted  to  the  development  of  the  dynamic  system  models. 
With  the  oxides  of  nitrogen  data  collected  in  Calgary,  Sarnia, 
Sudbury,  Toronto,  and  Windsor,  stochastic  models  explaining 
the  pollutant's  behavior  in  those  cities  are  developed  in 
Chapter  V  and  compared  with  the  corresponding  model  for 
Edmonton.  This  comparison  resulted  in  a  general  stochastic 
model  for  the  behavior  of  oxides  of  nitrogen  in  the  urban 
atmosphere.  Finally,  in  Chapter  VI  recommendations  and 
suggestions  concerning  the  use  of  the  models,  and  further  use 
of  time  series  analysis  technique  in  building  urban  air 
pollution  models  are  given. 
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CHAPTER  I 


INTRODUCTION  AND  LITERATURE  REVIEW 

1 . 1  Introduction 

Constituents  of  air  pollution  include  dust,  fumes, 
gas,  mist,  odor,  smoke,  vapor  and  noise.  The  mere  existence 
of  any  of  these  pollutants  in  ambient  air  does  not  necess¬ 
arily  constitute  pollution.  Air  pollution  occurs  when  the 
presence  of  one  or  more  of  them  is  in  such  quantity, 
characteristic,  and  duration  capable  of  damaging  or  un¬ 
reasonably  interfering  with  comfortable  enjoyment  of  life 
and  property.  The  injurious  levels  of  concentration  of  most 
of  the  pollutants  as  well  as  the  sources  and  causes  of  the 
pollutants  have  been  identified  by  researchers,  several  of 
whom  are  mentioned  in  the  following  sections.  However, 
knowing  the  source  and  concentration  is  not  enough,  it  is 
also  desirable  to  know  the  fluctuation  of  the  pollutants' 
concentration  with  time  and  space.  That  is,  one  would  like 
to  know  in  advance  when  a  particular  pollutant  is  likely  to 
reach  a  dangerous  level  in  the  atmosphere  so  as  to  prevent 
it  from  doing  so  wherever  possible.  Such  knowledge  is  poss¬ 
ible  to  obtain  through  models  that  explain  the  behavior  of 
the  pollutant  over  time.  There  are  several  air  pollution 
models  in  existence  using  the  diffusion  approach.  As  pre¬ 
dictive  models,  they  give  either  consistently  low  estimates 
or  consistently  high  estimates  of  the  pollutants  as  explained 
in  the  symposium  report  edited  by  Atkisson  and  Gaines  [2]. 
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The  object  of  this  thesis  is  to  present  a  different 
approach  to  air  pollution  modelling.  This  approach  employs 
time  series  analysis.  The  measurements  of  a  particular 
pollutant  observed  over  a  long  period  of  time  will  be 
studied  and  analyzed.  The  analysis  will  yield  a  model  of 
the  stochastic  process  which  generates  the  behavior  of  a 
pollutant  in  an  urban  area.  A  transfer  function  model  of 
a  dynamic  system  with  the  measurements  of  the  pollutant  as 
output  will  also  be  built.  The  models  will  be  used  in  fore¬ 
casting  and  their  forecast  estimates  will  be  compared  with 
the  observed  values  of  the  pollutant  to  prove  the  models' 
potential  as  a  predictive  device. 

The  air  pollutant  that  will  be  so  investigated  is  the 
gas  known  as  oxides  of  nitrogen  (NO  )  .  The  behavior  of 
this  pollutant  in  Edmonton  will  be  studied  and  compared 
with  the  behavior  of  the  same  pollutant  in  some  other  cities 
in  Canada. 

Time  series  analysis  requires  a  great  deal  of  computa¬ 
tion  including  inversion  of  matrices,  iterations  involving 
solutions  of  sets  of  linear  equations,  and  numerical  estima¬ 
tion  of  derivatives.  Since  a  long  series  of  observations  is 
required  to  build  the  models,  time  series  analysis  requires 
facilities  for  large  data  processing.  Therefore,  useful 
results  are  obtainable  from  this  type  of  analysis  only  if  a 
digital  computer  is  employed.  The  investigator  must  have 
both  practical  and  theoretical  knowledge  of  the  computer 
science  discipline  including  numerical  analysis,  and  infor- 
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mation  retrieval.  The  problem  investigated  in  this  thesis, 
therefore,  is  in  the  area  of  the  interface  of  computing 
science  with  one  of  today's  main  problems  affecting  all  of 
humanity . 

1 . 2  Measurement  and  Control  of  Urban  Air  Pollution 
1.2.1  Sources 

In  order  to  measure  and  control  air  pollution,  various 
sources  of  the  pollutants  must  be  known.  Sources  have  been 
recognized  as  Natural  Air  Pollution  Sources,  and  Man-made 
Sources  as  discussed  in  Rossano  [61].  Some  examples  of 
natural  sources  are  swamps  producing  gases  and  odors,  forest 
fires  yielding  smoke  and  flyash,  and  wind  blowing  dust  and 
pollen.  Sources  of  man-made  pollution  cover  a  very  wide 
range.  Only  some  of  the  major  ones  affecting  urban  envir¬ 
onment  will  be  mentioned  here. 

One  of  the  major  man-made  sources  of  air  pollution  is 
combustion.  Combustion  sources  can  be  categorized  as 
follows : 

(a)  Fuel  burning  in  home  heating  units  and  power  plants. 

(b)  Motor  vehicles  represented  by  autos,  buses  and  trucks. 

(c)  Refuse  burning  in  community  and  apartment  house  in¬ 
cinerators  . 

Pollutants  emitted  by  combustion  are  oxides  of  sulfur,  oxides 
of  nitrogen,  carbon  monoxide,  smoke,  flyash,  metal  oxide  , 
particles  and  odor. 

Manufacturing  processes  as  sources  of  air  pollution 
may  be  classed  into  two  categories: 


(a)  Metallurgical  plants,  represented  by  smelters,  steel 
mills,  and  aluminium  refineries,  and 

(b)  Chemical  plants,  represented  by  petroleum  refineries, 
pulp  mills,  super  phosphate  fertilizer  plants,  and 
cement  mills. 

Pollutants  associated  with  manufacturing  processes  include 
particles,  oxides  of  nitrogen,  oxides  of  sulfur,  hydrogen 
sulfide,  fluorides,  organic  vapor,  and  odor. 

Some  other  well  known  man-made  sources  include  the 
following: 

(i)  Nuclear  energy  activities  producing  radioactive  fall 
out. 

(ii)  Dust  producing  processes  like  crushing,  grinding, 

demolition,  and  milling,  which  generate  mineral  and 
organic  particulates. 

(iii)  Agricultural  activities  like  crop  spraying,  field 

turning,  and  straw  burning,  which  produce  pollutants 
like  organic  phosphates,  chlorinated  hydrocarbons, 
smoke  and  flyash. 

The  number  and  the  spatial  distribution  of  these 
sources  are  very  important  in  measurement  and  control.  For 
this  purpose  the  sources  have  been  categorized  into  the 
following  emission  types: 

(a)  Point  sources:  these  sources  have  a  high 

rate  of  emission  and  can  be  easily  recognized.  Example 
are  power  plants,  petroleum  refineries,  and  steel  mills 

(b)  Area-wide  (or  multiple)  sources:  these  consist  of  a 
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large  number  of  smaller  sources  distributed  over  a  well- 
defined  area  like  an  entire  residential  area. 

(c)  Line  sources:  these  for  example  include  freeways,  high¬ 
ways  and  arterials  carrying  a  steady  stream  of  moving 
vehicles . 

1.2.2  Air  Pollutant  Measuring  Devices  and  Units  of 
Measurement 

There  are  two  main  methods  for  the  determination  of 
airborne  contaminant  concentration.  One  is  by  remote  sensing 
and  the  other  is  by  removal  of  particulates  and  gases  from  the 
gas  stream. 

Remote  sensing  and  analysis  does  the  measurement  on 
the  material  in  situ.  The  advantage  of  this  method  is  that 
it  does  not  involve  physical  contact  with  the  substance  and 
hence  no  error  from  alteration  or  modification  is  involved. 
Remote  infra-red  sensing  of  sulfur  dioxide  in  stack  plumes 
is  an  example  of  this  technique. 

The  second  method  collects  particulates  and  gases  from  a 
9 as  stream,  then  subjects  the  material  collected  to  analy¬ 
sis.  For  gases,  a  sample  of  the  air  stream  is  absorbed  in 
specific  reagent  liquids  before  wet  chemical  analysis  is  per¬ 
formed.  There  are  both  manual  and  automatic  devices  using 
this  method.  The  automatic  devices  known  as  Automatic  Cont¬ 
inuous  Analyzers  are  more  desirable  in  air  pollution  monit¬ 
oring.  They  continuously  perform  the  following  operations 
in  sequence:  collect  air  samples,  analyze  them,  and  record 
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the  results  either  in  form  of  a  curve  on  a  strip  chart  or 
by  punching  the  results  on  paper  tape.  These  automatic  de¬ 
vices  are  available  for  only  a  few  of  the  gas  pollutants. 
However, one  of  such  pollutants  for  which  they  are  currently 
available  is  oxides-of-nitrogen .  More  detailed  description 
of  these  and  other  devices  can  be  found  in  Rossano  [61], 
while  names  of  some  of  the  commercially  available  brands 
can  be  found  in  Magill  et  al.  [35]. 

The  commonly  used  units  of  measurement  in  air  pollu¬ 
tion  monitoring  are  parts  per  million  parts  of  air  measured 
by  volume  (ppm) ,  and  micrograms  per  cubic  meter  of  air 

3 

(yg/m  ) .  Gas  pollutants  are  usually  measured  in  ppm  (or 
sometimes  parts  per  hundred  million  (pphm)  while  particulates 

3 

are  measured  in  yg/m  .  In  the  OECD  [50] report,  these  and 
other  units  of  measurement  are  defined. 

1.2.3  Control 

The  main  objectives  of  measuring  and  analyzing  air 
pollutants  are  to  be  able  to  obtain  information  for  setting 
control  standards  and  to  be  able  to  keep  the  pollutants  with¬ 
in  the  control  limits.  There  are  tolerance  thresholds  with 
the  main  air  pollutants.  These  help  in  setting  air  quality 
standards  for  the  pollutants.  Such  standards  for  different 
pollutants  are  well  documented  in  Atkisson  and  Gaines  [2]. 

Once  the  standards  are  known  various  techniques  can 
then  be  applied  to  ensure  that  these  standards  are  not  vio¬ 
lated.  One  kind  of  control  is  to  stop  the  human  activity 
generating  the  pollutant.  Another  form  of  control  effects 
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a  reduction  in  the  emission  rate  of  the  pollutant.  Engineers 
have  been  able  to  effect  emission  reduction  in  three  ways. 

The  first  is  by  process  modification.  An  example  of  process 
modification  is  the  modification  of  automobile  engines  (e.g. 
Chrysler  "Clean  Air  Package"  includes  a  modified  carburetor) 
to  prevent  the  generation  of  excessive  concentrations  of 
carbon  monoxide  and  hydrocarbons.  The  second  method  devised 
by  engineers  is  material  substitution.  Substitution  of,  for 
example,  a  low-sulfur  fuel  such  as  natural  gas  for  a  high- 
sulfur  fuel  reduces  the  emissions  of  sulfur  dioxide.  The 
third  method  is  gas  cleaning.  Gas  cleaning  employs  three 
methods  for  removing  gaseous  contaminants;  absorbing  the 
pollutant  into  a  liquid,  axjsorbing  the  pollutant  onto  the 
surface  of  a  solid,  and  chemically  changing  the  pollutant 
into  a  non-polluting  substance.  For  the  removal  of  part¬ 
iculate  air  pollutants  from  their  gaseous  media  one  or  a 
combination  of  mechanisms  and  forces  are  employed.  Such 
mechanisms  and  forces  include  gravitational  force,  centri¬ 
fugal  force,  magnetic  force  and  thermal  diffusion  giving 
rise  to  devices  like  settling  chambers,  centrifugal  coll¬ 
ectors,  wet  collectors,  and  filters.  The  descriptions  of 
some  of  this  equipment  can  be  found  in  Rossano  [61]. 

1 . 3  Oxides  of  Nitrogen  as  an  Air  Pollutant 
1.3.1  Major  Air  Pollutants 


The  known  major  air  pollutants  are  the  oxides  of 
nitrogen,  of  sulfur,  and  of  carbon,  particulates,  ozone  and 
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oxidant,  odor,  and  organic  contaminants.  Examples  of  some 


of  these  pollutants  are  listed  below. 


Pollutant 

Oxides  of  Nitrogen 

Oxides  of  Sulfur 

Oxides  of  Carbon 

Organic  Contaminants 


Constituents 

Nitric  Oxide  (NO) 
Nitrogen  Dioxide  (N02) 

Sulfur  Dioxide  (S02) 
Sulfur  Trioxide  (SO^) 

Carbon  Monoxide  (CO) 
Carbon  Dioxide  (C02) 

Aldehydes 

Phenols 


In  this  thesis  NO  is  studied  since  it  is  one  of  the  most 

x 

'important'  air  pollutants  -  this  will  become  apparent  in 

the  discussions  below  -  and  because  NO  data  are  available 

x 

over  long  period  of  time ,they  are  suitable  for  Time  Series 
Analysis . 

The  general  term,  NO^,  includes  NO,  N02,  N2°4'  an<^ 
N20j_.  Since  N204  an<^  N2°5  ex^st  only  in  small  quantities 
and  are  not  known  to  be  capable  of  producing  any  adverse 
effects,  they  are  not  important  by  themselves  in  air  pollu¬ 
tion.  The  term  NO  in  air  pollution  often  refers  to  only 

X 

NO  and  N02  together. 

Oxides  of  nitrogen  are  introduced  to  the  atmosphere 
from  a  variety  of  sources.  A  principal  source  of  man-made 
oxides  of  nitrogen  is  the  combustion  of  fossil  fuel,  which 
is  used  in  power  plants,  heating  equipment,  and  internal 
combustion  engines.  NO  is  produced  by  automobile  exhausts 

X 


. 


•  • 
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owing  to  inefficient  burning  of  fuel.  Details  of  the  emis¬ 
sion  of  NO^  by  automobile  exhausts  and  steps  taken  to  con¬ 
trol  it  can  be  found  in  Agnew  [1],  NO  and  NO^  are  known  to 
be  present  in  dangerous  concentrations  (145-1000  ppm)  in 
cigarette  smoke.  Leithe  [33]  contains  detailed  information 
on  cigarette  smoke  and  Lawrence  [32]  describes  efforts  to 

remove  NO  from  tobacco  smoke, 
x 

Production  of  oxides  of  nitrogen  during  combustion  can 
be  expressed  in  the  following  chemical  reaction  equation: 


N2  +  °2  +  2NO . 

After  the  initial  combustion  reaction,  nitric  oxide  further 
reacts  with  oxygen  to  form  NO^  and  higher  oxides.  These 
reaction  processes  are  well  explained  by  Strauss  [69]. 

Since  NO  reacts  with  oxygen  at  ambient  temperature  to  form 
NO ^ /  NO  is  seldom  found  in  appreciable  concentration.  There¬ 
fore  NO  is  the  main  component  of  NO  . 

^  X 


1.3.2  Toxicology  of  NO^ 

Nitrogen  dioxide  (N02)  is  especially  poisonous  and  its 
presence,  even  in  small  quantity  creates  a  health  hazard. 

It  can  be  perceived  by  smell  in  concentrations  as  low  as  0.1 
ppm,  although  the  odor  threshold  may  be  as  high  as  25  ppm 
if  one  is  accustomed  to  it.  With  increasing  dosage  of  N02 
the  following  sequence  of  effects  can  be  observed:  odor  per¬ 
ception,  nasal  irritation,  discomfort  in  breathing,  acute 
respiratory  distress,  pulmonary  oedema,  and  finally,  death. 


More  information  about  these  effects  can  be  found  in  Strauss 


[69]. 

Nitric  oxide  is  not  toxic  in  the  10-50  ppm  range. 
Since  large  NO  concentrations  are  not  stable  in  air  but  are 
converted  to  NO^  little  is  known  about  damages  purely  due 
to  it.  Paralysis  and  convulsion  have  been  reported  after 
exposing  animals  to  NO,  but  no  cases  of  poisoning  in  man 
have  been  reported. 

1.3.3  Effect  of  NO^  and  Photochemical  Smog  on  Plants 

Nitrogen  dioxide  (N02)  a  phototoxic  substance 
since  it  can  cause  damage  to  vegetation.  One  of  the  main 
effects  is  chlorosis  of  leaves  between  the  veins,  where  the 
destruction  of  chlorophyll  results  in  bleaching  of  the 
leaves.  The  concentration  level  beyond  which  bleaching  is 
induced  is  about  3  ppm.  Another  effect  of  the  phototoxic 
NO^  is  the  suppression  of  plant  growth.  Concentration  of 
less  than  1  ppm  affects  plant  growth  according  to  Oglesby 
in  Rossano  [61]. 

Although  oxides  of  nitrogen  by  themselves  may  not 
reach  dangerous  levels  in  the  ambient  air,  enough  evidence 
shows  that  in  the  Los  Angeles  area  the  products  of  photo¬ 
chemical  reactions  initiated  by  nitrogen  dioxide  do  reach 
levels  which  are  harmful  to  humans,  animals  and  plants. 
Middleton  [42]  discusses  such  damages.  A  full  explana¬ 
tion  of  how  photochemical  smog  is  caused  is  given  by  Agnew 
[1],  and  Strauss  [69]. 
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1.3.4  NO  Standards 
x  

Clean  air,  that  is,  air  in  areas  sufficiently  distant 
from  places  of  human  activity  or  other  abnormal  influences, 
is  known  to  contain  approximately  20.93%  by  volume  of  0^? 

7  8.1%  by  volume  of  ?  0.9  3%  by  volume  of  Argon;  0.0  3%  by 
volume  of  CO^;  and  other  minor  gases.  Traces  of  so  called 
pollutants  in  concentrations  below  1  ppm  also  occur  in 
'clean  air',  where  these  pollutants  are  due  to  natural  pro¬ 
cesses  as  discussed  in  Junge  [29].  NO  concentration  in 

X 

'clean  air'  ranges  from  0.000  to  0.030  ppm. 

Air  quality  standards  vary  with  organizations  setting 
them  and  purposes  for  which  they  are  set.  Thus  the  World 
Health  Organization  (WHO)  is  always  interested  in  the 
"hygienic"  level  of  air  quality  or  "no-effect"  level,  while 
some  countries  such  as  Canada,  U.S.A.,  Western  Germany  base 
their  standards  on  presently  achievable  goals.  The  three 
levels  of  air  quality  standards  are  the  adverse  level,  the 
serious  level,  and  the  emergency  level.  Descriptions  of 
these  levels  of  allowable  concentrations  and  durations  of 
time  for  major  pollutants  can  be  found  in  Strauss  [69],  and 
Atkisson  et  al.  [2].  In  order  to  prevent  possible  risk  to 
public  health  and  atmospheric  discoloration,  California  in 
1969  adopted  the  NO r  standard  of  a  concentration  of  0.25  ppm 
for  at  most  one  hour's  duration  in  ambient  air.  For  auto¬ 
mobile  exhaust  emission,  California  in  1971  adopted  4.0 
grams  of  nitrogen  oxides  per  mile  as  the  maximum.  In 
Alberta,  the  Department  of  Health  has  set  0.15  ppm  as  the 
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maximum  acceptable  limit  of  NO  concentration  in  develop- 
ing  the  Alberta  Combined  Air  Quality  Index  (ACAQI) .  However 
this  index,  ACAQI,  is  still  in  the  initial  development 
stage.  In  comparison,  the  maximum  acceptable  level  is  the 
same  as  the  serious  level  mentioned  before.  The  federal 
Government  of  Canada  denotes  the  three  levels  as  maximum 
desirable  level,  maximum  acceptable  level,  and  maximum 
tolerable  level  where  the  maximum  tolerable  level  corres¬ 
ponds  to  the  emergency  level  mentioned  earlier.  More 
examples  of  NO^  concentration  standards  adopted  in  some 
other  countries  can  be  found  in  Hepple  [23], 

In  the  next  section,  the  diffusion  technique  in  air 
pollution  modelling  and  some  existing  diffusion  models 
will  be  discussed  briefly. 

1 . 4  Diffusion  Model 

1.4.1  Basis  of  Diffusion  Models 

Most  efforts  in  modelling  urban  air  pollution  use  the 
diffusion  approach  based  on  two  similar  methods.  The  first 
requires  the  exact  solution  of  an  equation  of  continuity 
for  the  pollutant  where  the  adequate  determination  of  some 
co-efficients  is  necessary.  Details  of  the  equation  of  con¬ 
tinuity  for  an  air  pollutant  are  discussed  in  Atkisson  et  al. 
[2]  while  its  application  to  atmospheric  diffusion  can  be 
found  in  Pasquill  [56].  Since  the  coefficients  required  for 
the  solution  of  the  equation  of  continuity  are  representat¬ 
ives  of  the  product  of  eddy  size  and  eddy  velocity  which  in 
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turn  depend  on  ma$iy  variable^ , the  exact  solution  of  the 
equation  becomes  very  difficult.  As  a  result  the  first 
method  is  not  used  very  often  in  predicting  atmospheric 
concentration  of  pollutants. 

The  second  method  originally  developed  by  Panofsky 
[53]  enjoys  more  popularity  because  it  is  easy  to  work  with. 
In  its  simple  form  for  a  continuous  point  source,  the  model 
for  ground  level  concentration  can  be  expressed  as 


C 


Q 

tTuo’  ~ 

y  z 


exp 
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where  C 
u 

Q 

a 

y 
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is  the  pollutant  concentration, 
is  the  mean  wind  speed  (assumed  constant) , 
is  the  rate  of  pollutant  emission  (gram/sec.), 
is  the  standard  deviation  of  horizontal  plume 
concentration , 

is  the  standard  deviation  of  the  vertical 


z 

plume  concentration,  and 

x,y,z  are  the  spatial  coordinates  representing  the 
downwind,  crosswind,  and  vertical  distances, 
respectively,  with  regard  to  the  origin  at  the 
point  source. 

The  assumption  inherent  in  this  approach  is  that  the  concen¬ 
tration  distribution  from  a  continuous  source  has  a  Gaussian 
distribution  relative  to  the  centre  line  of  the  plume  both 
in  the  vertical,  and  the  perpendicular  direction  to  the  wind. 
Effluent  from  a  continuous  point  source  moves  downwind, 
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spreading  horizontally  and  vertically  such  that  the  dist¬ 
ribution  of  tne  concentration  of  the  contaminant  in  any 
cross  section  along  either  the  horizontal  or  the  vertical 
is  Gaussian  (i.e.,  has  a  normal  distribution)  while  the  mass 
in  such  a  cross  section  is  constant.  The  variances  of  the 
two  normal  distributions  are  functions  of  diffusion  co¬ 
efficients  and  of  distance  x,  downwind.  More  explanation 
of  this  approach  can  be  obtained  from  Moses-  [46]. 

1.4.2  Examples  of  Urban  Air  Pollution  Models 

Several  people  have  used  the  diffusion  approach  to  ur¬ 
ban  air  pollution  modelling. 

Lamb  [31]  in  1968  used  the  approach  to  produce  a  most 
extensive  atmospheric  diffusion  model  for  the  Los  Angeles 
basin.  His  model  was  based  on  the  solution  of  the  contin¬ 
uity  equation.  The  model  was  used  to  compute  carbon  mon¬ 
oxide  concentrations  over  Los  Angeles  for  September  23,  1966 
at  1200  grid  points.  The  results  did  not  agree  satisfactor¬ 
ily  with  observations  owing  in  part  to  some  assumptions  in 
the  model  and  also  to  sources  outside  the  Los  Angeles  basin 
which  were  not  accounted  for.  Despite  its  limitations. 

Lamb's  model  is  one  of  the  great  advancements  in  solving  this 
problem. 

Clarke  [12] used  the  popular  Gaussian  diffusion  app¬ 
roach  to  build  one  of  the  most  well-known  models.  In  his 
model,  the  receptor  (or  monitoring  station)  was  located  at 
the  centre  of  four  concentric  circles  having  radii  of  1,  4, 
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10,  and  20  kilometers,  respectively.  These  circles  were 
divided  into  16  equal  sectors  of  22.5  degrees.  A  source  in¬ 
ventory  was  obtained  for  each  of  the  64  annular  sectors. 

For  each  of  the  annular  rings  a  chart  was  prepared  relating 
C/Q,  as  defined  in  (1.4.1),  and  wind  speed  for  various  at¬ 
mospheric  stability  classes.  To  refine  his  model,  Clarke 
[13]  later  considered  the  contributions  made  by  traffic  flow, 
industry  and  commerce.  The  concentration  at  a  point  was 
then  the  sum  of  contributions  from  the  point  sources,  traffic 
flow,  industry,  and  space  heating  sources.  He  utilized  his 
model  to  compute  S09  and  NO  concentrations  in  Cincinnati, 
Ohio . 

Other  existing  models  using  the  diffusion  approach  in¬ 
clude  those  by  Hilst  [26],  Pooler  [57],  and  Ryan  [63],  Like 
the  two  discussed  above  the  models  show  commendable  efforts, 
as  well  as  pointing  to  the  fact  that  continued  effort  in  the 
development  of  urban  air  pollution  models  is  necessary. 
Therefore,  a  completely  different  approach  to  urban 
air  pollution  modelling  is  worth  considering. 

1 . 5  Computers  and  Pollution  Problem 
1.5.1  Data  Processing 

Any  air  pollution  surveillance  program  has  to  handle 
a  great  deal  of  data.  Efficient  operation  demands  that  the 
data  be  well  organized  and  be  available  for  use  when  it  is 
required.  For  this,  a  good  information  storage  and  retrieval 
system  of  the  quality  and  magnitude  which  only  a  digital  com- 
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puter  can  handle  is  essential.  An  example  of  the  task  for 
such  a  system  is  the  management  of  the  emission  inventory 
information.  Since  sources  of  air  pollution  in  an  urban 
area  increase  with  time  and  the  rates  of  emission  of  these 
sources  do  fluctuate,  continuous  information  about  sources 
is  required.  An  information  system  that  will  handle  this 
task  should  have  a  large  storage  capacity  as  well  as  an 
up-dating  facility.  The  Province  of  Ontario  has  such  a  com¬ 
puterized  source-inventory  management  system  as  explained  by 
Shanks  et  al.  [65]  . 

The  author  of  this  thesis  believes  that  an  air  pollu¬ 
tion  monitoring  network  can  be  handled  better  by  the  use  of 
computers  rather  than  doing  it  manually.  The  measuring  in¬ 
strument  can  be  interfaced  with  an  analog-digital  converter 
which  in  turn  can  be  coupled  with  a  small  computer.  Several 
small  computers  at  different  stations  can  then,  through  com¬ 
munication  lines,  at  required  intervals,  report  back  to  a 
central  computer.  One  of  the  advantages  of  implementing 
this  idea  is  that  human  mistakes  in  reading  and  recording 
data  from  measuring  equipment  will  be  eliminated.  Since 
reports  will  be  received  regularly  through  the  central  com¬ 
puter,  action  to  control  pollution  concentration  could  be 
taken  more  promptly.  In  addition  the  central  computer  can 
be  programmed  to  check  and  report  equipment  break-down  and 
other  such  adverse  situations.  The  point  made  in  this  para¬ 
graph  remains  as  a  suggestion  as  far  as  this  thesis  is  con¬ 
cerned,  efforts  in  this  work  will  concentrate  on  developing 


- 


17 


models,  other  than  the  diffusion  model,  for  NO  . 

X 

1.5.2  Modelling  and  Simulation 

Not:  only  is  the  computer  useful  in  the  monitoring  of 
air  pollution  and  the  storage  and  retrieval  of  the  data,  it 
also  is  indispensable  in  the  analysis  of  the  data,  as  for 
example  in  modelling  and  simulation  of  air  pollution.  As 
discussed  by  Moses  [46],  most  of  the  diffusion  models  deve¬ 
loped  are  impossible  to  handle  without  the  use  of  the  com¬ 
puter  for  the  enormous  computations.  The  indispensability 
of  the  computer  in  using  time  series  analysis  to  develop  an 
air  pollution  model  will  be  quite  obvious  from  Chapters  III 
and  IV  of  this  thesis.  The  use  of  the  model  for  predictive 
purposes  requires  computations  that  are  better  handled  by 
the  computer.  If  done  by  hand  the  result  may  not  be  ready 
in  time  for  action. 

Where  models  are  not  based  on  empirical  data,  the 
supply  of  input  data  has  to  come  from  simulation  ,  For 
diffusion  models  emission  data  from  the  sources  of  the 
pollutant  as  well  as  meteorological  data  are  required  to 
estimate  the  pollutant's  concentration  at  a  point.  How¬ 
ever  in  order  to  study  potential  pollution  distributions  or 
ones  where  only  sparse  data  is  available,  realistic  values 
for  the  diffusion  model  must  be  generated  by  simulation. 
Simulation  then  could  use  source-inventory  information  com¬ 
bined  with  conjectured  meteorological  conditions  to  produce 
input  data  for  the  diffusion  model.  The  Ontario  computerized 
air  pollution  management  system  referred  to  earlier  /[65], 
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incorporates  such  simulation  facility. 

1.5.3  Models  Proposed  in  this  Thesis 

Since  time  series  analysis  deals  with  large  samples 
(long  series)  of  observations  the  computer  is  an  essential 
tool  for  storage  and  accurate  retrieval  of  the  data  and  for 
fast  calculations  with  the  data.  The  computer  will  thus  be 
used  extensively  in  the  development  of  the  two  types  of 
models  for  oxides  of  nitrogen  proposed  in  this  thesis.  The 
first  is  the  stochastic  model  which  is  developed  in  Chapter 
III  while  the  second  is  the  dynamic  system  model  developed 
in  Chapter  IV.  Both  of  the  models  make  use  of  observed  ser¬ 
ies  of  data.  The  stochastic  model  uses  only  the  oxides  of 
nitorgen  data,  whereas,  for  the  dynamic  system  model  the 
pollutant's  observed  values  as  well  as  the  observations 
of  factors  influencing  the  pollutant's  concentration  are 
considered.  As  the  model  building  procedures  in  Chapters 
III  and  IV  will  show,  the  series  of  data  are  analyzed  and 
made  to  supply  information  on  which  the  models  are  based, 
hence  the  models  proposed  in  this  thesis  are  bound  to  be 
representative  of  the  pollutant's  behavior. 


. 


CHAPTER  II 


OBSERVED  NO  DATA:  SOURCE,  PRECISION, 
x 

AND  ORGANIZATION 


In  stochastic  modelling  as  opposed  to  deterministic 
modelling,  past  and  current  observations  of  a  phenomenon  are 
employed  to  establish  a  model  that  explains  the  behavior  of 
the  phenomenon  within  certain  probabilistic  limits.  Thus 
stochastic  model  building  is  strictly  based  on  observed  data 
which  must  be  very  reliable  if  the  model  is  to  explain  the 
phenomenon  adequately.  In  addition  to  reliability  the 
amount  of  data  must  be  adequate.  For  time  series  analysis 
the  data  need  to  consist  of  at  least  fifty  successive  obser¬ 
vations  and  preferably  one  hundred  or  more.  The  reasons  for 
this  data  requirement  will  be  given  later  in  Chapter  III. 

2 . 1  Source  and  Precision 

The  NO^  data  used  to  develop  the  models  are  those  of 
Edmonton.  Models  for  Calgary,  Windsor,  Sarnia,  Toronto  and 
Sudbury  data  were  developed  also  as  comparison  to  the 
Edmonton  model.  Both  the  Edmonton  and  the  Calgary  data  were 
obtained  from  the  Environmental  Health  Services  Division  of 
the  Alberta  Department  of  Health  in  Edmonton,  while  the 
Windsor,  Sarnia,  Toronto  and  Sudbury  data  came  from  the 
Ministry  of  Environment  of  the  Province  of  Ontario.  The 
Alberta  Environmental  Health  Services  Division  has  been  coll¬ 
ecting  and  storing  air  pollution  data  for  the  city  of  Edmonton 
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since  July,  1964.  A  monthly  summary  of  the  data  known  as 
"Air  Pollution  Summary  Edmonton"  is  published  by  the  depart¬ 
ment.  A  similar  publication  exists  for  Calgary  as  well. 

The  main  air  pollutants  for  which  data  are  available 
in  Edmonton  are  particulate  matter  (recorded  as  a  soiling 
index),  oxides  of  nitrogen,  nitrogen  dioxide,  oxidants, 
hydrocarbon,  and  carbon  monoxide.  Continuous  monitoring  of 
Sulfur  Dioxide  is  just  being  developed.  Hourly  averages  of 
NO  and  the  five  minute  peaks  are  recorded  at  only  one 
station  located  on  the  third  floor  of  the  Administration 
Building,  109  Street  and  98  Avenue  which  is  on  the  southward 
fringe  of  the  downtown  area.  Owing  to  frequent  machine  mal¬ 
function  there  are  many  breaks  in  the  record.  Another  cause 
of  discontinuity  of  the  record  is  the  zeroing  of  the  in¬ 
strument  which  is  done  every  day.  This  accounts  for  loss 
of  two  hourly  observations  per  day,  since  only  one  machine 
is  used.  At  present  the  zeroing  takes  place  between  3  and 
5  p . m. 

The  measuring  instrument  is  a  Beckman  Atmosphere 
Analyzer  Model  K1004,  which  is  manufactured  by  the  Beckman 
Company  for  continuous  measurement  of  oxides  of  nitrogen. 
The  specifications  for  the  Beckman  Model  K1004  state  that 
the  ppm  NO  in  the  air  are  measured  with  a  precision  of  3 

X 

decimal  places.  The  instrument  quantitatively  introduces 
both  the  air  sample  and  the  reagent  solution  into  a  con¬ 
tactor  where  the  contaminant  is  absorbed  into  the  solution. 
After  the  liquid  is  separated  from  the  air  stream,  it  flows 


. 
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into  the  detector  where  the  chemical  analysis  is  performed  by 
coulometry .  The  signal  from  the  detector  is  amplified  and 

fed  into  a  read-out  device,  in  this  case  a  chart  recorder, 

which  draws  a  continuous  curve  of  the  amount  of  NO  in  the 

x 

analyzed  air  samples.  Hourly  averages  of  NO  are  then  ob- 

X 

tained  by  a  technician  who  reads  these  off  the  chart  by  "eye¬ 
balling".  These  hourly  averages  are  then  recorded  and  stored 
as  part  of  the  data  base  of  NO  measurements.  The  principle 
and  technique  upon  which  the  instrument  is  based  were  pro¬ 
posed  by  Saltzman  [64]. 

Although  NO  has  been  monitored  in  Edmonton  since  1964, 

X 

it  is  not  easy  to  obtain  NO  series  long  enough  for  time 

X 

series  analysis  owing  to  the  reasons  given  earlier  in  this 

section.  Missing  data  which  create  holes  in  the  records 

break  the  observations  into  very  short  series  most  of  which 

are  less  than  fifty  data  points  long.  Since  long  series  of 

about  100  data  points  are  desirable  for  time  series  analysis 

a  large  part  of  the  records  are  not  useful  for  the  analysis. 

However  by  searching  through  the  whole  data  base,  series  long 

enough  to  study  hourly  and  daily  behavior  of  NO  are  found. 

The  NO^  Hourly  Average  series  which  was  chosen  for 

analysis  is  that  for  the  two  weeks,  February  22  to  March  7, 

1967  which  contained  336  hourly  measurements  without  any 

"holes"  (missing  observations) .  For  the  purpose  of  this 

work  maximum  NO  concentration  for  a  day  is  defined  as  the  max- 

imum  of  the  hourly  averages  for  the  day.  The  daily  maximum 

concentration  of  NO  obtained  as  just  defined  constitute  the 

x 
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Daily  Maximum  NO  series  from  April  1,  1971  to  March  31,  1972 

X 

used  to  study  the  daily  behavior.  This  part  of  the  record 
has  very  few  holes  to  be  filled  as  compared  to  other  parts. 
About  twenty  holes  were  encountered  and  nearly  all  of  them 
consist  of  one  missing  observation  while  the  rest  have  five 
or  less  observations  missing.  The  holes  that  existed  were 
filled  by  taking  the  average  of  one  observation  before  the 
hole  and  one  after  it.  In  this  way  366  consecutive  daily- 
maximum  observations  were  obtained  for  the  analysis. 

2 . 2  Organization 

Oxides  of  Nitrogen  data  are  merely  a  part  of  the  pol¬ 
lution  data  collected  for  the  City  of  Edmonton  as  the  last 
section  indicates.  In  addition  to  the  pollutants'  data, 
wind  data  are  also  collected  and  stored  along  with  the  air 
pollution  data.  These  data,  as  they  are  collected  each  day, 
get  punched  on  computer  cards  and  stacked  in  monthly  decks. 
The  monthly  decks  contain  an  average  of  400  cards,  and,  hence 
up  to  March  1972,  data  for  93  months  were  contined  on  about 
37,200  computer  cards.  In  order  to  make  these  data  more 
amenable  to  computer  processing  in  this  work  they  are  trans¬ 
ferred  to  magnetic  tape.  The  details  of  how  the  data  are 
organized  and  stored  on  tape  and  how  they  can  be  retrieved 
is  discussed  in  Appendix  A. 
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CHAPTER  III 


STOCHASTIC  MODEL  BUILDING 

As  stated  in  Chapter  I  two  types  of  models  are  deve¬ 
loped  in  this  thesis  to  represent  the  behavior  of  NO^.  The 
first,  referred  to  as  the  stochastic  model,  recognizes  the 
time-dependent  nature  of  a  phenomenon  in  the  presence  of  the 
unknown  factors  influencing  its  behavior.  The  second  model 
type  is  the  combined  transfer  function-noise  model  which  also 
takes  into  account  time-dependence,  and  in  addition  recog¬ 
nizes  the  dynamic  relationship  between  the  phenomenon  and 
the  influencing  factors.  Hence,  the  transfer  function- 
noise  model  is  the  system  analysis  approach  to  the  problem, 
where  the  phenomenon  observed  (in  this  case  NO  )  is  the  out- 

X 

put  from  the  system  and  the  factors  influencing  the  behavior 
(for  example  temperature)  of  the  phenomenon  are  the  inputs. 
The  first  model  type  will  be  established  for  NO^  in  the 
rest  of  this  chapter  while  the  second  will  be  developed  in 
Chapter  IV. 

In  designing  and  testing  the  models  for  the  NO  data 
the  methods  developed  by  Box  and  Jenkins  [6]  will  be  used 
extensively  and,  hence,  the  details  of  the  theoretical  dev¬ 
elopment  and  the  historical  background  of  those  techniques 
will  not  be  restated  here.  The  present  analysis  of  the  NO 

X 

data  will  thus  use  the  Box  and  Jenkins  [6]  approach  as  a 
tool  in  solving  the  problem  of  how  the  oxides  of  nitrogen 
process  behaves. 
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3.1  Stochastic  Models  in  NO^  Data  Analysis 


A  stochastic  process  is  a  statistical  phenomenon  that 
evolves  in  time  according  to  probabilistic  laws;  while  a 
time  series  is  a  set  of  values  observed  sequentially  in  time 
and  thus  is  one  particular  realization  of  the  underlying 
stochastic  process.  Observation  of  a  continuous  time  series 
can  be  made  at  some  fixed  interval  giving  a  discrete  se¬ 
quence  of  observations.  NO^  observations  dealt  with  here 
constitute  discrete  time  series  since  the  observations  are 
recorded  for  fixed  time  intervals.  The  first  series  consists 
of  the  NO  hourly  averages  where  the  observations  are  close 

X 

to  the  average  value  over  a  one-hour  interval  and  hence  the 
time  scale  is  in  hours.  The  second  series  consists  of  the 
daily  maxima  of  NO  where  the  maximum  of  the  hourly  averages 
is  taken  for  each  day  and  hence  this  time  scale  is  in  days. 

If  a  time  series  comes  from  a  stationary  process 
then  the  underlying  process  from  which  it  has  been  observed 
may  be  an  autoregressive  (AR)  process,  a  moving  average  (MA) 
process,  or  a  combination  of  both  called  autoregressive- 
moving  average  (ARMA)  process.  A  detailed  and  rigorous 
definition  of  the  stochastic  stationary  process  is  given  in 
Box  and  Jenkins  [6].  There  are  various  ways  in  which  a  pro¬ 
cess  can  be  non-s tationary  and  so  there  are  many  ways  in 
which  a  time  series  observed  from  a  non-stationary  process 
can  exhibit  non-stationary  behavior.  The  number  of  ways  non- 
stationary  behaviors  can  be  displayed  by  a  time  series  will 
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not  be  enumerated  here;  they  can  be  found  in  Box  and  Jenkins 
[6],  but  a  particular  kind  of  non-stationary  behavior  which 
the  two  NO  series  display  is  hereby  mentioned.  Like  many 

X 

empirical  time  series,  they  behave  as  though  they  have  no 
fixed  mean,  but  they  exhibit  homogeneity  in  the  sense  that, 
apart  from  a  local  level,  or  perhaps  a  local  level  and  trend 
one  part  of  the  series  behaves  much  like  any  other  part.  The 
stochastic  process  which  underlies  such  homogeneous  non- 
stationary  behavior  can, by  differencing, be  changed  to  a 
stationary  process,  that  is  the  resulting  process  of  the 
suitable  difference  may  be  an  AR,  MA,  or  ARMA  process. 
Therefore,  such  series  may  be  representable  by  the  modified 
form  of  the  ARMA  called  autoregressive  integrated  moving 
average  (ARIMA)  Models.  Non-stationary  behavior  of  a  series 
can  be  discovered  through  the  sample  autocorrelation  function 
as  will  be  shown  later  in  this  chapter.  The  sample  auto¬ 
correlation  function  also  can  be  used  as  an  aid  in  ident¬ 
ifying  which  of  the  three  types  of  models  to  try  and  fit  to 
the  series  exhibiting  stationary  behavior. 

It  is  necessary  to  give  a  brief  definition  of  the  terms 
autoregressive  and  moving  average  at  this  point.  Let  z  , 


z^_^,  zt_2/  .  be  the  values  of  a  process  at  equally 

spaced  time  points  t,  t-1,  t-2,  . ;  and  let  yt, 

y ,  . .  where  y^  =  z.  -  u,  be  the  deviations  from  the 

J t-2  ^  t  t 


mean,  u,  of  the  stationary  series.  Then  the  model,  relating 
the  present  value  y  ,  to  the  previous  values,  can  be  ex¬ 
pressed  as 


' 
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Llyt-1  +  a2yt-2 


+  a 


>yt-i 


(3.1.1) 


and  is  referred  to  as  an  autoregressive  (AR)  model  of  order 
p,  because  y  is  "regressed"  on  previous  values  of  the  same 
process.  The  coefficients  a^  (i  =  1,2,  .......  p)  are  re¬ 
ferred  to  as  the  autoregressive  parameters.  The  series  of 
shocks  e^_,  (known  as  "white  noise")  consists  of  a  sequence 
of  uncorrelated  random  variables  whose  distribution  is 

usually  assumed  to  be  normal  with  mean  zero  and  constant  var- 

2 

lance  0q  .  The  moving  average  model  expresses  y  in  terms 
of  present  and  past  random  shocks. 


yt  et  blet-l  “  b2et-2 


b  e.  ,  (3.1.2) 

q  t-q 


where  the  coefficients  b^  (j  =  1,  2,  . .  q)  are  referred 

to  as  the  moving  average  parameters.  The  moving  average 
model  is  equivalent  to  a  linear  filter  with  a  finite  number 

of  non -zero  weights  .  ARMA  model  is  a  combination  of  (3.1.1) 
and  (3.1.2)  . 

The  general  form  of  the  stochastic  model  that  will  be 
considered  here  can  be  written  as 


a(B)  {  (1-B)dzt  }  =  b(B)et 


(3.1.3) 


where  B  is  the  backward  shift  operator  defined  as 


27 


Bzt  =  zt-l  and  B  zt  “  zt-d  '  and 

a  (B)  =  1  -  a,  B  -  a^B2  -  .  -  a  Bp  , 

12  P 

b  (B)  =  1  -  b1B  -  b2B2  -  .  -  bgBq  , 

z^is  the  observed  value  (i.e.  transformed  as  discussed  in 

section  3.2)  of  NO  at  time  t,  {  (1  -  B) dz  }  is  the  differ- 

X  t 

enced  series,  and  e^  is  the  random  shock  at  time  tf  The 
integers  d,  p,  and  q  in  turn  denote  the  following: 

d  is  the  degree  of  differencing  performed  on  the 

data, 

p  is  the  number  of  autoregressive  parameters  (or 

order  of  the  AR  model  part) , 
q  is  the  number  of  moving  average  parameters  (or 

order  of  the  MA  model  part) . 

The  general  model  (3.1.3)  is  referred  to  as  an  auto¬ 
regressive  integrated  moving  average  model  ARIMA  (p,  d,  q) . 
The  same  model  can  be  written  as  an  ARMA  (p,  q)  model 

a(B)wt  =  b(B)et  ,  (3.1.4) 

where  w^  =  (1  -  B)  z^  is  a  stationary  series  whose  length, 
n  =  N  -  d.  In  the  next  section  the  specific  stochastic 
models  that  explain  the  behavior  of  the  NO  hourly  averages 

X 


and  the  daily  maxima  of  NO  will  be  identified. 

X 

3 . 2  Stochastic  Model  -  Identification 

The  sets  of  the  NO^  raw  data  presented  in  Series  A 
and  B  were  transformed  to  their  natural  logarithms.  Since 
the  measurements  constituting  the  data  are  very  small 
numbers  in  the  range  0.000  to  0.500  ppm,  and  since  the 
natural  logarithm  of  zero  is  indeterminable,  0.001  was 
added  to  each  observed  value  of  the  series  before  logarithmi 
transformation.  Thus  the  original  series  {zt>  is  trans¬ 
formed  to  the  series  {  In  (z^  +  0.001)  }.  Because  all  sub¬ 
sequent  analyses  will  be  performed  on  the  transformed  series 
any  reference  to  z  from  this  point  on  is  a  reference  to 
In  (z^  +  0.001)  except  where  a  distinction  is  made. 

The  transformed  time  series,  now  referred  to  as 
{ z  },  are  plotted  in  Figures  3.1  and  3.2.  Since  there  app¬ 
ears  to  be  variation  in  level,  no  fixed  mean  seems  to  exist, 
however,  a  "seasonal"  cycle  is  not  apparent  either.  Series 
behaving  in  this  way  come  from  non-stationary  processes. 
Nevertheless,  appropriate  differencing  of  the  transformed 
series  may  result  in  a  stationary  series  which  then  would 
be  suitable  for  the  proposed  analysis. 

It  is  conjectured  here  that  the  amount  of  differenc¬ 
ing  required  to  get  a  stationary  series  may  be  indicated  by 
comparing  the  variance  of  the  undifferenced  series  with  the 
variances  of  the  various  differenced  series.  A  reduction  in 
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Fig.  3.1  NO  Hourly  Averages  (Series  A  transformed) 
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Fig.  3.2  Daily  Maxima  of  NO  (Series  B  transformed) 
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the  observed  (sample)  variance  seems  to  indicate  less  in¬ 
stability  of  tne  series.  For  the  two  series  under  consider¬ 
ation,  the  first  differences  have  a  lower  variance  than  the 
others  as  shown  in  Tables  3.1  and  3.2. 

TABLE  3.1  Differences  for  transformed  Hourly 

Averages  of  NO^ 


Difference 

Length  of  Series 

Mean 

Variance 

No  Differencing 

336 

-5.495 

1.202 

(1  -  B) 

335 

-0.004 

0.444 

(1  -  B)  (1  -  B) 

334 

0.007 

2.970 

24 

(1  -  BZq) 

312 

-0.021 

2.497 

(1  -  B)  (1  -  B24) 

311 

-0.013 

1.452 

TABLE  3.2  Differences  for  transformed 

Maxima  of  NO 

X 

Daily 

Difference 

Length  of  Series 

Mean 

Variance 

No  Differencing 

366 

-4.111 

1.577 

(1  -  B) 

365 

0.000 

0.981 

(1  -  B)  (1  -  B) 

364 

-0.008 

4.587 

(1  -  B7) 

359 

0.019 

1.881 

(1  -  B)  (1  -  B7) 

358 

0.003 

2.115 

* 
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The  24th  difference  of  the  Hourly  Averages  was  taken  to  see 
whether  or  not  the  variation  in  level  was  due  to  a  24  -  hour 
'seasonal'  pattern;  while  the  7th  difference  of  the  Daily 
Maxima  was  taken  to  investigate  the  7  -  day  'seasonal' 
pattern  that  could  exist  in  the  Daily  Maximum  of  NO^  series. 
The  results  in  the  tables  show  that  those  "seasonal" 

patterns  do  not  seem  to  exist  in  the  two  series,  otherwise 
those  differenced  series  would  have  had  lower  variances. 
Thus  the  stationary  model  type  that  should  be  investigated 
first  for  both  series  is  the  ARMA  (p,  q)  model  of  the  form 
(3.1.4)  where  d  =  1. 

An  important  and  more  reliable  indicator  of  the  degree 
of  differencing  needed  is  the  autocorrelation  function 
Pzz(k)  ,  where 

Pzz(k)  =  E  [  (Zt  -  U)  (Zt+k  -  U)  ]  /  crz2  ,  (3.2.1) 

U  is  the  mean  of  {Z^},  and  k  is  the  lag  or  time  difference 
between  the  values  of  the  time  series  considered  for  corre¬ 
lation.  The  failure  of  the  autocorrelation  function  to 
rapidly  decay  indicates  non-stationary  behavior  on  the  part 
of  the  series,  hence,  further  differencing  and  or  trans¬ 
formations  must  be  considered  and  tried.  For  both  of  the 
observed  time  series  (in  this  case  their  transformations) 
we  obtain  the  corresponding  sample  autocorrelation  functions 
r  (k) ,  referred  to  simply  from  here  on  as  the  autocorre- 
lation  function.  Figures  3.3  and  3.4  show  the  autocorre- 
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lation  functions  of  the  undifferenced  hourly  average  and 
daily  maximum  series  respectively;  while  Figures  3.5  and  3.6 
show  the  autocorrelation  functions  of  the  respective  first 
difference  {W^} .  The  autocorrelation  function  of  the  daily 
maxima  of  NO  plotted  in  Figure  3.4,  certainly  decays 
rather  slowly,  thus  indicating  non-stationary  behavior. 
However,  upon  taking  the  first  difference  the  corresponding 
autocorrelation  function  shown  in  Figure  3.6  indicates  com¬ 
paratively  low  values  beyond  lag  2. 

Further  examination  of  the  autocorrealtion  function 
reveals  the  specific  types  of  models  which  may  be  fitted  to 
the  data.  Models  may  be  categorized  according  to  the  number 
of  autoregressive  and  moving  average  terms  to  be  considered, 
[i.e.  (p,  q)  ].  Since  the  theoretical  autocorrelation 

function  is  known  for  a  given  model,  the  shapes  of  these 
functions  may  broadly  be  classed  into  two  categories;  the 
exponentially  decaying  functions  and  the  functions  display¬ 
ing  exponential  decay  mixed  with  damped  sine  functions.  The 
examples  below  indicate  the  shapes  of  the  theoretical  auto¬ 
correlation  functions  for  some  of  the  model  types. 

(a)  ARMA  (1,0):  wt  =  awt_]_  +  et  (3.2.2) 

The  autocorrelation  function  p(k),  decays 
exponentially  where  p(l)  =  a,  and  the  residual 
variance 

2  ,,  2. 
a  =  (1  -  a  )  a 
e  w 
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(b) 


ARMA( 0 ,  1)  :  w. 


et  ’  bet-l 


The  autocorrelation  function  has  value 

zero  beyond  lag  1, 

p  (1)  =  -b  /  (1  +  b2)  for 

-1  <  b  <  1;  and, 

2 


a 


2 


w 


1+b' 


(3.2.3) 


(c) 


ARMA ( 2 /  .  0 )  :  wfc  =  a^w  ^  +  a2Wt-2  +  et 
The  autocorrelation  function  is  a 

mixture  of  exponential  decay  and  a 
damped  sine  wave.  The  parameters 
a^,  a^ ,  are  given  by 


(3.2.4) 


al 

= 

P(l) 

[1 

-  p(2) ]  /  [1 

a2 

= 

[  P  (2) 

— 

P2(l)  1  /  [1 

where 

for 

stationarity  of 

-1 

< 

a2  < 

1/ 

a2 

+ 

ai  < 

1/ 

and 

a2 

— 

ai  < 

1. 

ae  =  [1  -  a1p(l) 


a2p(2)  ]  °w' 


(d) 


ARMA  ( 0 ,  2)  :  -  b^t-l  ”  ^2et-2 

The  autocorrelation  function  is 
zero  beyond  lag  2.  The  parameters 
b^r  b^,  are  given  by 

p  (1)  =  -b1  (1  -  b2)  /  (1  +  bx2  +  b22), 


(3.2.5) 


. 
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P  ( 2 )  =  -b^  /  (1  +  +  b^2)  where 

for  invertibility  of  the  model 
-1  <  b2  <  1, 
b2  +  b^  <  1,  and 


a  +  b12  +  b22) 


(e)  ARMA  ( 1 ,  1):  +  e^_  -  (3.2.6) 

The  autocorrelation  function  decays 
exponentially  as  of  lag  1.  The 
parameters,  a  and  b,  can  be  obtained 
from 

p(l)  =  (1  -  ab)  (a-  b)  /  (1  +  b  ^  -  ab) 
p(2)  =  ap(l)  where 

-1  <  a  <  1,  and 

-1  <  b  <  1  in  order  for  the  model  to 
be  stationary  and  invertible. 

2  =  gw  [1  -  a  ] 
ae  ( 1-ab)  (a-b) 

Since  the  theoretical  autocorrelation  function  is  not 

known,  the  observed  sample  autocorrelation  function  is  used 

as  its  estimate.  The  observed  autocorrelations  at  lag  k, 

k  =  1,  2,  3,  . .  K  and  K  =  N/4  are  sufficient  to  identify 

the  model.  The  sample  autocorrelation  function,  r  (k)  or 

c  ww 


. 
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simply  r(k)  is  obtained  from  the  sample  autocovariance 
function  C  (k)  or  simply  C(k).  Hence,  the  estimator  of 
P  (k)  is  r (k) , 

r  (k)  =  C  (k)  /  C(0)  ,  (3.2.7) 

where  C(k)  is  the  estimate  of  the  theoretical  autocovariance 
of  lag  k. 


x  N-k 

C{k)  =  n  (wfc  -  57)  (wfc+k  -  w)  ,  (3.2.8) 

for  k  =  0,  1 ,  2 ,  . ,  K 

C(0)  is  the  sample  estimate  of  the  theoretical  var- 
iance  of  the  series  {wt} ,  and  v7  is  the  observed  average 

of  the  series.  Usually  w  is  zero  since  w  is  the  difference 
of  the  series  {z^}.  It  should  be  noted  that  to  obtain  a 
useful  estimate  of  the  autocorrelation  function,  at  least 
50  observations  are  needed. 

Figure  3.5  shows  that  the  hourly  NO  may  perhaps  just 
be  an  ARMA(0,  0),  namely  just  white  noise  process  written 

as  w^  =  e^_,  because  r(l),  r(2),  . .  r(k)  do  not  appear 

to  be  very  different  from  zero.  Figure  3.6  shows  that  the 
daily  NO  may  be  fitted  by  an  ARMA (O ,  2)  model  otherwise 

X 

known  as  an  MA  model  of  order  2,  see  (3.2.5) .  It  can  be 
seen  that  after  lag  2  the  observed  autocorrelations  do 
not  appear  to  be  very  different  from  zero.  These  con¬ 
jectured  models  will  be  used  as  starting  points  in  trying  to 
find  the  appropriate  models  for  the  two  sets  of  NO  data. 

X 
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3 . 3  Stochastic  Model  -  Preliminary  Estimation 

Tne  use  of  the  autocorrelation  function  (or  autoco¬ 
variance  function)  in  the  initial  estimation  of  the  para¬ 
meters  and  the  residual  variance  can  be  extended  to  deal 
with  models  of  more  than  two  parameters.  In  determining 
initially  which  model  to  use  in  the  preliminary  stage,  it 
must  be  pointed  out  that,  although,  a  given  ARMA  model  has 
a  unique  covariance  structure,  a  particular  covariance 
structure  does  not  determine  an  ARMA  model  uniquely.  In 
other  words  there  is  no  one  to  one  correspondence  between 
autocovariance  functions  (or  autocorrelation  functions)  and 
ARMA  models. 

One  of  the  initial  estimation  tools  is  the  Yule  - 
Walker  equation  in  Jenkins  and  Watts  [28],  which  uses  the 
observed  autocorrelations  to  solve  for  parameters  in  auto¬ 
regressive  models.  For  instance,  let  the  model  be  autore¬ 
gressive  and  of  order  5,  then  the  five  autoregressive  para¬ 
meters  will  be  obtained  by  solving  the  following  set  of  five 
linear  equations: 


a^r (0)  +  a2r(l) 
a^r (1)  +  a2r(0) 
axr(2)  +  a2r(l) 

alr  O)  +  a2r  ( 2) 
a1r(4)  +  a2r(3) 


+  a3r(2)  +  a4r 
+  a3r(l)  +  a4r 
+  a3r(0)  +  a4r 
+  a3r(l)  +  a4r 
+  a3r(2)  +  a4r 


(3)  +  a5r(4)  = 
(2)  +  a^r ( 3)  = 
(1)  +  a^r ( 2)  = 
(0)  +  a^r  (1)  = 
(1)  +  a5r(0) 


r  ( 1) 
r:(2) 

r ( 3)  (3.3.1) 

r  (4) 


=  r  ( 5) 


•'  -  ' 
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If  the  model  is  of  order  p,  then  p  such  equations  will  have 

to  be  solved  for  the  p  parameters  a^ ,  i  =  1,  2,  . ,  p. 

After  obtaining  these  initial  parameter  values  an  estimate 
of  the  residual  variance  is  given  by 

_  2 

e  -  C (0)  [1  -  a^r ( 1)  -  a2r (2) -  .  -  a  r(p)]  f 

where  a^^  is  the  estimated  parameter.  For  further  details  on 

this  approximation  of  the  residual  variance  see  Bartlett  [4]. 

A  general  method  of  using  the  autocovariances  for  ob¬ 
taining  initial  estimates  of  the  parameters  of  autoregress¬ 
ive  -  moving  average  processes  is  described  in  Chapter  6  of 
Box  and  Jenkins  [6],  Since  their  method  is  used  to  obtain 
the  initial  estimates  for  the  NO  models,  a  brief  summary 

X 

of  the  algorithm  is  given  in  Appendix  B. 

For  the  problem  on  hand,  a  general  Fortran  IV  program 
(see  Program  2  in  Owolabi  [52])  following  the  algorithm  in 
Appendix  B  is  developed  and  implemented.  Using  the  pro¬ 
gram  the  following  sixteen  ARMA  models  with  the  appropriate 
initial  estimates  were  produced  for  both  the  hourly  and 


daily  NO^  data 

(i.e.  for  the 

first 

differences , 

w^ ,  of  the 

transformed  data) : 

(p , q)  =  (1/0)  , 

(2,0),  (3,0), 

(4,0)  , 

(1/1)/  d/2) 

/  (2,1), 

(2,2) , 

(3,1),  (3,2), 

(4.1)  , 

(4,2),  (0,1) 

,  (0,2), 

(0,3)  , 

<o,4).  ' 

Table  3.3  shows  the  initial  parameter  estimates  obtained  for 
some  of  these  models  for  hourly  NO  while  Table  3.4  shows 

X 


ri 
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the  same  thing  for  daily  NO^. 


TABLE  3.3  Initial  Parameter  Estimates  for  transformed 
Hourly  Averages  of  NO^ 


Model  for  w 

Parameter 

al  a2 

Estimates 

bl  b2 

Residual 

Variance 

ARMA (1,0) 

-0.039 

— 

— 

- 

0.444 

ARMA (1,1) 

0.310 

— 

0.349 

— 

0.444 

ARMA (1,2) 

6.769 

— 

0.185 

-0.006 

20.362 

ARMA (2,0) 

-0.040 

-0.014 

— 

0 . 444 

ARMA (2,1) 

-0.376 

2.213 

0.068 

— 

2.725 

ARMA (0,1) 

— 

— 

0.039 

— 

0.444 

ARMA (0,2) 

— 

— 

0.039 

0.012 

0.444 

By  examining  the  initial  estimates  of  the  residual  variances 
and  of  the  parameters  of  these  models,  the  model  with  the 
smallest  variance,  and  with  parameters  enjoying  the 
stationarity  and  invertibility  properties  is  the  model 
which  promises  to  be  the  "best-fitting"  one.  Details  re¬ 
garding  the  invertibility  and  stationarity  conditions  are 
given  in  Box  and  Jenkins  [6].  Another  factor  that  needs 
to  be  considered  in  identifying  the  model  that  may  fit  best 
is  parsimony  in  the  use  of  parameters.  A  model  with  less 
parameters  than  the  "  optimal  model  "  and  with  residual 


■ 
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TABLE  3.4  Initial  Parameter  Estimates  for  transformed 
Daily  Maxima  of  NO^ 


Model  for  w 

Parameter 

al  a2 

Estimates 

bi 

b2 

Residual 

Variance 

ARMA (1,0) 

-0.298 

— 

— 

— 

0. 89  3 

ARMA (1,1) 

0.513 

— 

0. 891 

— 

0.875 

ARMA (1,2) 

-0.203 

— 

0.227 

0.260 

0.806 

ARMA (2,0) 

-0.377 

-0.265 

— 

— 

0.830 

ARMA (2,1) 

0.768 

-0.498 

-0.135 

— 

7.450 

ARMA (0,1) 

— 

— 

0.331 

— 

0.884 

ARMA (0, 2) 

— 

— 

0.460 

0.191 

0.786 

variance  a  little  bit  larger  than  the  one  for  the  optimal 
may  produce  quite  an  adequate  fit  to  the  data,  and  may  be 
easier  to  interprete  in  its  practical  context.  Considering 
these  points  and  looking  at  Table  3.4  it  is  easy  to  see 
that  ARMA (0,2)  model  may  be  able  to  fit  the  daily  data. 

Table  3.3  shows  that  any  of  the  ARMA(1,0),  ARMA (0,1)  models 
can  fit  the  hourly  data,  but  they  are  not  better  than  the 
ARMA (0,0)  process  conjectured  earlier  for  the  data. 

The  algorithm  employed  for  the  solution  of  the 
sets  of  linear  equations  in  this  work  is  the  Gauss  -  Jordan 
algorithm  described  in  many  numerical  analysis  books  such  as 
the  one  by  Carnahan,  et  al.  [9]. 


. 


The  models  decided  on  after  the  identification  and 


initial  estimation  procedures  must  be  rigorously  "processed" 
in  order  to  obtain  efficient  estimates  of  their  parameters. 
This  processing  will  be  done  for  the  two  models  which  have 
been  identified  for  the  hourly  and  daily  NO^  data  in  the 
estimation  stage  which  is  discussed  in  the  next  section. 

3 . 4  Stochastic  Model  -  Estimation 

The  initial  estimates  of  the  parameters  for  the  models 
identified  in  section  3.3  are  used  here  to  initiate  the 
iterative  procedure  which  produces  the  values  of  the  para¬ 
meters  that  give  the  minimum  residual  variance  (  or  minimum 
residual  sum  of  squares) . 

In  section  3.3  the  process  generating  the  first 
difference  of  the  transformed  daily  maxima  of  NO  (i.e.  w  ) 

X  "C 

was  identified  as  ARMA(0/2)  which  is 

w,  =  e,  -  b,e.  ,  -  b„e ,  „ 

t  t  1  t-1  2  t-2 

For  given  values  of  b-^,  b^  this  model  is  used  to  gener¬ 
ate  recursively  a  series  of  residuals,  efc,  given  by 

e  =  w  +  6,e^  -t  +  £>~e.  ^ 

t  t  1  t-1  2  t-2 

The  wfc,  t  =  1,  . .  N  -  1, constitute  the  series  to  be 

modelled;  the  unknown  starting  values  e^ ,  e_^  are  equated 
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to  the  expected  value  of  efc  which  is  zero.  The  sum  of 
squares  of  the  generated  residual  series,  {e  } ,  is  then 
calculated.  In  this  manner  the  sum  of  squares  can  be 
calculated  for  all  possible  values  of  the  parameters  b^  , 
b2  and  the  set  of  values  of  the  parameters  producing  the 
minimum  sum  of  squares  is  the  best  set  of  parameter  esti¬ 
mates  for  the  fitted  model.  The  algorithm  for  this 
iterative  model  estimation  constitutes  Appendix  C.  A 
Fortran  IV  program  (see  Program  3  in  Owolabi  [52])  is 
written  to  implement  this  algorithm.  Using  that  program  the 
most  efficient  set  of  values  of  the  parameters  of  the 
ARMA (0,2)  model  for  the  daily  NO  data  are 

X 

B^  =  0.500  with  standard  error  of  b^  equal  to  0.045, 

B2  =  0.260  with  standard  error  of  b2  equal  to  0.051, 

d  2  =  0.772 
e 

Here  the  parameters  are  large  compared  to  their  standard 
errors  and  should  be  retained. 

As  discussed  in  section  3.3,  w^  ,  the  first  difference 
of  the  transformed  hourly  NO^  data  seemed  to  behave  like  a 
white  noise  process,  ARMA (0,0) .  However,  one  of  the  sixteen 
initial  ARMA  models  for  the  wt  series  having  a  low  residual 
variance  estimate  is  ARMA(0,1).  The  corresponding  efficient 
estimate  of  its  parameter  b,  can  be  found  at  this  stage. 

Then  the  ARMA (0,1)  model  can  be  compared  to  the  ARMA (0,0) 
to  see  which  one  fits  the  hourly  data  better.  The  ARMA (0,1) 
model  for  the  hourly  data  is 


•  ,  I 

- 


47 


The  series  of  residuals,  e^,  for  the  model  can  be  generated 
for  the  given  value  of  b,  according  to 


et  =  wt  +  bet-i 


where  again  the  value  e^  is  set  to  zero.  Using  the  model 
estimation  algorithm  of  Appendix  C  CL .  e .  the  same  Fortran 
Program)  as  for  daily  data,  the  most  efficient  parameter  of 
the  ARMA (0,1)  model  for  hourly  average  NO  is  found  to  be 

X 


b  =  0.054  with  standard  error  of  b  equal  to  0.064, 

d  2  =  0.444 
e 


Since  the  parameter  is  very  small  compared  with  its  standard 
error  it  cannot  be  considered  to  be  different  from  zero. 
Hence,  the  model  that  fits  the  hourly  series  is  ARMA (0,0) 
and  not  ARMA(0,1).  This  implies  that  the  first  difference 
of  the  transformed  hourly  data  behaves  like  a  white  noise 
process . 

The  model  that  fits  the  transformed  NO  Hourly 

X 

Average  series  is 


which  can  be  written  in  terms  of  the  z  as 


[  . 


48 


For  the  transformed  daily  maximum  NO  series  the  model 

**  X 

that  seems  to  have  the  best  fit  is 


Wt  et  0,5  et~l  0,26  et-2 


which  can  be  written  in  terms  of  z^_  as 


z,  =  z,  ,  +  e,  -  0.5  e,  ,  -  0.26  e.  ~ 
t  t-1  t  t-1  t-2 


Before  finally  accepting  these  models  as  being  repre¬ 
sentative  of  the  processes  generating  the  corresponding 
series  they  should  undergo  some  diagnostic  checks.  These 
diagnostic,  checks  .are  /presented  in  the  .next  section. 


3 . 5  Stochastic  Model  -  Diagnostic  Checks 

In  this  section  three  tests  are  employed  as  diagnostic 
checks  of  the  models.  In  effect  these  check  the  behavior 
of  the  residuals  e  ,  whether  or  not  they  are  white  noise. 

A  good  fitting  model  describes  the  data  behavior  (nearly) 
completely  leaving  only  small  independent  errors  in  the  un¬ 
explained  part. 

The  first  diagnostic  check  is  the  comparison  of  the 
estimated  variance  of  the  fitted  series  (i.e.  w^  in  this 
case)  to  the  estimated  residual  variance.  A  small 


2 
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~  2 

implies  that  most  of  the  is  explained  by  the  model. 

In  the  second  test,  the  series  of  the  residuals  e^  is 
checked.  If  a  model  provides  a  good  fit  to  the  data,  then 
the  residuals  remaining  should  constitute  a  series  of  white 
noise  since  white  noise  is  uncorrelated  (see  section  3.1) 
the  autocorrelation  function  re(k)  =  0,  for  k  >  0-  This 
second  test,  therefore,  involves  estimating  the  autocorre¬ 
lation  function,  r^Ck),  of  the  residual 

series  e^,  t  =  1,  2,  .......  n,  and  performing  a  chi  -  square 

test  on  these  autocorrelations.  The  test  statistic,  hereby 
called  Q,  can  be  shown  to  have  a  Chi  -  Square  distribution 
with  K  -  p  -  q  degrees  of  freedom  where  K  =  n/4.  Q  is 
obtained  by 

K  2 

Q  =  n  Z  r  (k) 
k=l 

If  Q  is  less  than  the  upper  5%  critical  point  for  Chi  - 
Square  with  the  appropriate  degrees  of  freedom  then  the 
hypothesis  that  re(k)  =  0  for  k  >  0  is  accepted,  and  the 
residuals  can  be  taken  as  constituting  a  white  noise  series. 

The  third  test  called  the  cummulative  periodogram  test 
checks  if  a  seasonal  pattern  exists  in  the  residuals  which 
would  indicate  that  the  model  does  not  explain  the  behavior 
of  the  data  as  well  as  it  should.  A  seasonal  pattern  in  the 
residuals  will  be  indicated  by  periodicity  in  the  residual 
series,  and  it  can  be  detected  by  the  use  of  periodogram. 
The  periodogram,  I(f.)  of  a  time  series  e  ,  t  =  1,  2,  . . 

1  L 


n  is  defined  as 


I(f.)  =  ^  (  (£  e  Cos  2-rrf.t)2  +  (  2  e  Sin  27Tf.t)2] 
1  n  t=l  t  1  t=l  t  1 


n 


where  f.  =  -  is  the  frequency  and  i  =  1,  2, 


n 


Since  the  cumulative  spectrum,  P(f),  of  white  noise,  if 

plotted  against  f,  is  a  straight  line  running  from  (0,0)  to 

(0.5, a  ),  (or  P(f)  /  a  )  is  a  straight  line  running  from 
6  6 

(0,0)  to  (0.5,1),  and  since  1(f)  provides  an  unbiased 
estimate  of  the  power  spectrum  at  frequency  f,  the  cumulat¬ 
ive  periodogram  divided  by  the  observed  error  (or  residual) 
2  2  ~  2 

variance  s  (where  s  =  a  )  is  an  estimate  of  P(f.)  /a 

C  0  0  0 


C(fj) 


•£nI(f.) 

i=l  l 


ns 


Detailed  explanation  about  this  can  be  found  in  Box  and 
Jenkins  [6],  C(f^)  is  known  as  the  normalized  cumulative 

periodogram.  If  the  fitted  model  adequately  accounts  for 
seasonal  patterns  so  that  no  seasonal  pattern  exists  in  the 
residuals,  the  plot  of  C(f_. )  against  f  .will  be  scattered 
around  the  straight  line  joining  the  points  (0,0)  and 
(0.5,1).  Using  the  Kolmogrow  -  Smirnov  test,  as  explained 
in  Hald  [22],  to  draw  95%  confidence  limits  for  the  plot, 
the  probability  lines  should  not  be  crossed  more  than  5% 


o 


f  the  time  if  the  residuals  behave  in  a  white  noise 


fashion. 
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These  three  tests  were  applied  to  the  residuals  of 
the  two  models  developed  in  section  3.4  and  the  results  are 
the  following: 

(a)  NO  Hourly  Averages 

X 

(i)  a  2  =  1.202 

z 

d  2  =  0.444 

e 

Variance  due  to  model  (differencing  in  this 
case)  is  0.758 

(ii)  Calculated  Chi  -  Square  is  Q  =  63.68  for 

n/4  =  83  degrees  of  freedom.  The  upper  5% 

critical  point  for  Chi  -  Square  with  83 

degrees  of  freedom  is  105.27. 

(iii)  Figure  3.7  demonstrates  the  result  of  the 

cumulative  periodogram  test.  The  points 

scattered  along  the  theoretical  line  and  are 

all  within  95%  confidence  limits. 

2  ~  2 

In  test  (i) ,  d  =  d  because  the  model  is  ARMA(0,0) .  How- 

e  w 

ever,  the  main  interest  is  in  the  behavior  of  z^.  Since 
~  2  ~  2  ~  2 

a  =  a  <  a  it  can  be  concluded  that  the  model  written 
e  w  z 

in  terms  of  z  explains  the  behavior  of  the  transformed 
nourly  data  well  enough.  In  test  (ii) ,  the  fact  that  the 
calculated  Chi  -  Square  is  less  than  the  theoretical  one, 
implies  that  rg  (k)  =  0,  for  k  >  0,  hence  the  residuals  con¬ 
stitute  a  white  noise  process.  Test  (iii)  also  shows  that 
the  residuals  constitute  a  white  noise  process,  and  that 
there  is  no  seasonal  pattern.  Thus  the  model  fitted  to  the 
hourly  data  in  section  3.4  is  finally  accepted. 


< 


. 
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FREQ 

Fig.  3.7  Cumulative  Periodogram  for  the  residuals 

of  the  model  fitted  to  the  transformed  NO^ 
Hourly  Averages 


53 


(b) 


Daily 

Maxima  of 

NO 

X 

(i) 

~  2 
o  = 

z 

1.577 

a  2  = 
w 

0.981 

5  2  = 
e 

0.772 

Variance  due  to  model  (considering  w^)  is 
0.209. 

Variance  due  to  model  (considering  z^)  is 
0.805. 


(ii)  Calculated  Chi  -  Square  is  Q  =  66.52  for 

[(n/4)-  2]  =  89  degrees  of  freedom.  The 
upper  5%  critical  point  for  Chi  -  Square 
with  89  degrees  of  freedom  is  112.02. 


(iii)  Figure  3.8  is  the  result  of  the  cumulative 

periodogram  test.  The  plots  scatter  along 

the  theoretical  line  and  are  all  within  the 

95%  confidence  limits. 

2  ~  2  ~  2 

Since  in  test  (i)  a  <  a  <  a  ,in  test  (ii)  the  calcul- 
ated  Chi  -  Square  is  less  than  the  theoretical  one,  and  in 
test  (iii)  the  cumulative  periodogram  test  indicates  no 
seasonal  pattern,  the  model  fitted  to  the  transformed  daily 
maxima  of  NO^  in  section  3.4  can  be  finally  accepted. 

Since  the  models  have  passed  through  the  diagnostic 
checks  and  are  found  adequate  for  explaining  the  behavior 
of  the  corresponding  series,  they  can  be  used  for  forecasting. 
The  models  will  then  be  used  to  forecast  NO  concentrations 
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in  the  next  section. 

3 . 6  Forecasting  with  the  Stochastic  Models 

From  the  models  established  in  section  3.4  and  sub¬ 
jected  to  diagnostic  checks  in  section  3.5,  forecast  funct¬ 
ions  will  be  derived  later  on  in  this  section.  First,  a 
brief  description  of  confidence  limits  of  forecasts  will  be 
given.  Let  2t(L)  be  the  forecast  made  at  origin  t  for  an 
observation  z  ,  which  is  L  steps  ahead.  Then  95%  confi- 

t  '  J-J 

dence  limits  of  the  forecast  for  lead  time  L  are 


z  (L)  +  1.96  /vTL T  , 


(3.6.1) 


where  V(L)  is  the  variance  of  the  L  steps  -  ahead  -  fore¬ 
cast  error  given  by 


V  (L ) 


[1  + 


L-l 

E 

j  =  l 


The  weights  j  =  1,  2,  . .  L  -  1  can  be  obtained  from 

the  parameters  of  the  fitted  model  by  equating  coefficients 
of  powers 


(1  -  a^B  -  .  -  a  BP)  (1  —  B) d  (1  +  ifijB  +  il>2 B2  +  . ) 

=  (1  -  bjB  -  b2B2  -  .  -  b  Bq) 

More  details  about  il>  .  can  be  obtained  from  Box  and  Jenkins 

3 

[6]. 


•  ' 
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The  model  fitted  to  the  transformed  NO,  hourly  averages 

X 

in  section  3.4  is 


z ,  -i  t  e  , 
t-1  t 


and  from  it  we  can  try  to  derive  a  forecast  function.  The 
forecast  at  origin  t  of  an  observation  L  steps  ahead  is  given 
by 

VL>  =  zt  +  et(L) 

Since  e^CL)  is  unknown  it  is  equated  to  its  expected  value 
zero  and  so  the  forecast  function  becomes 

2t(L)  =  zt 

This  forecast  function  shows  that  the  best  forecast  value 
for  all  future  time  is  the  current  value.  Since  the  fore¬ 
casts  from  origin  t  of  the  next  L  hourly  values  of  NO 
supplied  by  this  function  is  the  value  at  the  current  origin 
the  forecast  for  various  L  -  values  is  simply  a  straight 
horizontal  line. 

The  model  fitted  to  the  transformed  daily  maximum  N0X 
series  in  section  3.4  is 
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’t-1 


+ 


0 . 5e 


t-1 


0. 26e 


t-2 


The  forecast  function  derived  from  this  model  is 


*t(L)  =  Zt+L-1  +  et  '  0'5et+L-l  -  °-26et+L-2 


The  terms  on  the  right  of  the  forecast  function  are  treated 

according  to  the  following  four  rules: 

(i)  The  z  .  (j=0,l,2, . ),  which  have  already 

t  ~  3 

happened  at  origin  t,  are  left  unchanged. 

(ii)  The  z  .  (j  =  l,2, . ),  which  have  not  yet 

L.T*  j 

happened,  are  replaced  by  their  forecasts  z^Cj)  at  origin  t. 

(iii)  The  e  .  ( j=0 , 1 , 2 ,......) ,  which  have  happened, 

t  3 

are  available  from  z,  .  -  z,  .  , (1) . 

t-  j  t-  j  - 1 

(iv)  The  et+j  (j  =  l,2, . ),  which  have  not  yet 

happened,  are  replaced  by  zero,  since  zero  is  the  uncondit¬ 
ional  expected  value  of  e^. 

For  lead  1  forecast,  for  example,  the  forecast  funct¬ 
ion  above  can  be  written  as 


zt(l)  =  zt  -  0.5*^  -  0.26et_2  . 

Using  (3.6.1)  the  95%  confidence  interval  of  lead  1  fore¬ 
cast  is  found  to  be  ^t(l)  +  1-72  for  the  transformed  daily 
maxima  of  NO  .  Figure  3.9  shows  the  lead-one  forecasts  of 

X 

the  last  30  values  of  the  transformed  daily  maximum  NO 

X 

series  and  their  95%  confidence  intervals.  All  the  observed 
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Fig.  3.9  Transformed  Daily  Maxima  of  NO  and  lead  - 
one  Forecasts  for  the  last  thirty  days  of 
the  series  using  Stochastic  Model 
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Fig.  3.10  Daily  Maxima  of  NO  and  lead  -  one 
Forecasts  for  the  ¥ast  thirty  days 
of  the  series 
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values  lie  within  the  corresponding  confidence  intervals. 

The  forecasts  ootained  are  transformed  back  to  the  original 
units  by  taking  the  antilogarithm  and  subtracting  0.001. 
These  forecasts  are  presented  in  Table  3.5  and  Figure  3.10 
so  that  both  the  forecasts  and  the  observed  values  can  be 
compared  in  the  original  data  frame.  Table  3.6  contains 
the  last  30  observations  of  the  original  series. 

The  forecasts  of  lead  times  1  to  30  presented  in 
Figure  3.11  also  show  that  all  the  observed  values  lie 
within  their  confidence  intervals.  However,  all  forecasts 
from  lead  -  2  on  have  constant  values.  This  fact  is  dis¬ 
played  on  Figure  3.11  by  the  horizontal  line  traced  by  the 
lead  times  2  to  30  forecast  values.  The  lead  times  1  to 
30  forecasts  are  also  transformed  back  to  the  raw  data  units 
and  are  presented  in  Table  3.7  and  Figure  3.12. 

TABLE  3.5  Lead-one  Forecasts  of  the  last  thirty  days  of  the 
Daily  Maximum  NO  Series  using  Stochastic  Model* 

X 


0.093 

0.077 

0.065 

0.080 

0.042 

0.063 

0.082 

0.050 

0.072 

0.066 

0.072 

0.061 

0.104 

0.042 

0.153 

0.082 

0.124 

0.092 

0.199 

0.163 

0.089 

0.088 

0.099 

0.112 

0.072 

0.076 

0.094 

0.0152 

0.114 

0.114 

*  Read  across  the  page.  The  forecasts  are  in  ppm. 


. 


TABLE  3.6  The  last  thirty  days  of  the  Daily  Maximum 


NO  Series* 
x 


0.153 

0.072 

0.094 

0.027 

0.053 

0.121 

0.042 

0.073 

0.074 

0.081 

0.057 

0.156 

0.028 

0.281 

0.117 

0.163 

0.098 

0.381 

0.279 

0.058 

0.050 

0.083 

0.124 

0.052 

0.054 

0.100 

0.281 

0.151 

0.113 

0.132 

*  Read  across  the  page.  The  observations  are  in  ppm. 


TABLE  3.7  Leads  one  to  thirty  Forecasts  of  the  last  thirty 
days  of  Daily  Maximum  NO  Series  using 

X 

Stochastic  Model* 


0.093 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

*  Read 

across  the 

page . 

The  forecasts  are 

in  ppm. 
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Fig.  3.11  Transformed  Daily  Maxima  of  NO  and  leads 
one  to  thirty  days  of  the  series  using 
Stochastic  Model 
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Fig.  3.12  Daily  Maxima  of  NO  and  leads  one  to 
thirty  Forecasts  for  the  last  thirty 
days  of  the  series. 
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The  forecasts  presented  in  the  tables  above , especially 
the  lead-one  forecasts  in  Table  3.5,  indicate  that  the 
stochastic  model  can  be  used  to  predict  NO^  concentration 
since  many  of  the  forecasts  are  close  to  the  observed 
values  in  Table  3.6.  In  Chapter  IV,  the  dynamic  system 
model  will  be  developed  and  forecasts  generated  by  it  can 
be  compared  with  the  stochastic  model  forecasts. 


CHAPTER  IV 


DYNAMIC  SYSTEM  MODEL 

4 . 1  Definition 

The  Dynamic  System  Model  developed  for  oxides  of 
nitrogen  in  this  chapter  is  known  as  a  combined  transfer 
function-noise  model.  In  developing  the  model ,  the  infor¬ 
mation  supplied  by  the  observed  NO  series  will  be  combined 

X 

with  the  information  supplied  by  the  series  of  factors 
influencing  the  behavior  of  NO^  in  the  urban  atmosphere. 

The  observed  values  of  NO^  are  regarded  as  output  from  a 
dynamic  system  while  temperature,  wind  speed,  and  traffic 
flow  are  factors  conjectured  to  influence  NC>x  in  the  urban 
atmosphere  and  are  regarded  as  imputs  (or  leading  indicators 
as  economists  know  them)  to  the  system.  The  dynamic  system 
in  this  case  is  the  urban  atmosphere  and  Figure  4.1  shows  a 
simple  representation  of  it. 

Since  three  measurable  imputs  to  the  system  have  been 
conjectured  for  this  work  the  sort  of  transfer  function 
model  that  can  represent  the  system  is  known  as  a  multiple 
input  transfer  function  model  whose  general  form  may  be 
written  as 

(1  -  B)  Zt  =  v1(B)  (1  -  B)  Xx  t  +  . 

+  v  (B)  (1  -  B)  +  E  (4.1.1) 

m  m  f  t  t 

where  X.  ,  X9  ,  . .  X  are  m  series  of  input 

1.  L  f  c.  ill  f  L 
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Fig.  4.1  Urban  Atmosphere  as  a  Dynamic  System 
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factors,  Zt  is  the  output  series,  and  is  the  series  of 
noise  disturbance  (or  imput  to  the  system  from  unknown 
sources)  not  to  be  confused  with  the  previously  mentioned 
"white  noise".  v^(B)  is  an  operator  such  that 


are  referred  to  as  the  impulse 


response  function  for  imput  X.  .  B  is  again  a  backward 


shift  operator  defined  as  BX.  =  X. 

1  f  L  1  f  L""  JL 

The  model  (4.1.1)  relates  incremental  change  in  the 
output  Zt  to  the  incremental  changes  in  the  input  factors. 
The  model  as  it  is  written  involves  a  general  assumption 
that  each  input  factor  requires  only  the  first  difference 
to  make  the  process  generating  it  stationary,  but  some  may 
require  higher  differencing.  Cases  where  the  type  of  dif¬ 
ferencing  varies  among  the  input  factors  can  be  handled  and 
an  example  of  it  is  discussed  in  section  4.2.  For  sim¬ 
plicity  the  assumption  is  introduced  in  this  general 
definition . 

The  function  (4.1.1)  can  be  expressed  as  follows: 


(1  -  B)  zt  =  a“1(B)  b± 


(B)  (1  -  B)  Xx, t_d  +  . . . 


+  a”1 (B)  b  (B)  (1  -  B)  X  .  , 

m  m  m,t-d 


m 


(4,1.2) 


+  E 


t' 
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where  a.(B)  and  b . (B) 
3  3 

series  as  follows : 


are  operators  defined  for  each  input 


a .  (B)  =  (1  -  a  .  ,  B  -  . -  a.  BP)  , 

3  3 r  1  3  /P 


bj  (B>  =  (bj,0  *  bj,lB  -  .  -  bj,sBS)- 


An  incremental  change  in  the  level  of  one  input  may  not 

have  an  immediate  effect  on  the  output  level.  This  time 

th 

delay  for  the  j  input  factor  known  as  delay  response 
time  is  represented  by  d_.  .  For  a  system  with  only  one 
input,  the  model  can  be  written  in  the  form  (4.1.1)  and 
(4.1.2)  as  follows: 

(1  -  B)  Z  =  (v  +  VjB  +  v2B2  +  ...)  (1  -  B)  X  +  E 

=  v  (B)  (1  -  B)  Xfc  +  Et 

=  a'1  (B)b(B)  (1  -  B)  Xt_d  +  Et, 


where  parameters  a^  and  b^  are  obtainable  by  equating 
coefficients  of  B  in  the  equation 


v (B)  =  a-1  (B) b  (B)  , 


-  a2B  - 


-  a  BP)  (vA  +  v,B2  + 

p  0  1 


bQ  -  blB  - 


-  b  B2)  Bd 
s 


i.e.  (1  -  a^B 


) 


s 


69 


This  equation  gives  the  following  relations  from  which  a^ 

and  b.  could  be  calculated  if  the  v.'s  are  known: 

1  3 


v  . 
3 


=  0, 


j  <  d, 


v  .  = 
3 

v  .  = 


b 


0 


j  =  d. 


a, v .  ,  +  a~  v .  ~  + 
1  3-I  2  j-2 


+  a  v  .  -  b  .  _  , 

P  3-P  3“d' 


j=d+l,  d+2, 


,  d  +  s 


v . 
3 


a,v.  ,  +  a0v.  n  + 
1  3-I  2  j-2 


j  >  d  +  s . 


+  a  v . 

P  3-P 


t 


(4.1.3) 


Assuming  that  the  impulse  response  function  v^  ,  j=0,  1,2, 


is  known,  then  p,  and  s,  the  number  of  the  a^(i  =  1, 


2,  . .  p)  parameters,  the  b^  (i  =  1,  2,  . .  s) 

parameters,  respectively  can  be  determined  from  the  v^'s. 
The  delay  response  time  d  may  easily  be  determined  from 
the  first  expression  in  (4.1.3).  Let  the  absolute  value 
of  Vj  be  a  maximum  at  j  =  m,  then  s  can  be  taken  to  be  any 
value  from  0,1,  . .  m  -  d.  If  there  are  several  con¬ 

secutive  maxima,  m  takes  on  the  value  j  of  the  last  maxi¬ 
mum.  The  value  of  p  can  be  determined  by  looking  at  v^ 

from  v  onwards.  From  v  the  v.'s  behave  like  the 
m  m3 

autocorrelation  function  of  an  autoregressive  process 
described  in  Chapter  III;  hence,  their  pattern  can  be  used 
to  identify  the  number  of  a^  parameters  in  the  model.  In 
practice  the  number  of  autoregressive  parameters  usually 


does  not  exceed  two,  i.e.. 


p  <  2 . 


There  are  no 


. 
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restrictions  on  the  magnitudes  of  b^,  b^,  . ,  b^ , 

whereas  the  parameters  a^,  a^ ,  . ,  a^  have  to  obey 

stationarity  rules  analogous  to  those  of  autoregressive 
parameters  discussed  in  section  3.2.  One  of  these  rules 
is  that  a^  +  a^  ......  +  a^  <  1.0,  which  greatly  helps 

at  the  model  identification  stage. 

The  steps  involved  in  building  a  dynamic  system  model 

are  very  similar  to  the  ones  outlined  in  Chapter  III  for  the 
stochastic  model.  They  are  in  order,  the  identification 
step,  including  initial  estimates  of  the  parameters,  the 
estimation  step,  the  diagnostic  checking  step  and  the 
forecasting  step;  and  will  be  dealt  with  in  sections  4.2, 
4.3,  4.4,  and  4.5,  respectively. 

Before  building  the  model,  the  input  factors 
should  be  described  briefly.  The  temperature  and  the  wind 
speed  data  are  measurements  taken  at  the  Edmonton  Indus¬ 
trial  Airport.  The  hourly  temperature  is  presented  as 
Series  C  while  the  daily  maximum  temperature  is  presented 
as  Series  D.  All  temperature  observations  are  given  in 
degrees  Fahrenheit.  Series  E  contains  the  hourly  wind 
speed  data  and  Series  F  is  the  daily  wind  speed  data, 
all  measured  in  mph.  The  traffic  flow  data  are  the  ob¬ 
servations  made  on  Jasper  Avenue  and  122  Street  in 
Edmonton  which  is  3  blocks  north  and  13  blocks  west  of  the 
NO^  monitoring  station.  Series  G  consists  of  the  hourly 
data  while  Series  H  consists  of  the  daily  data  of  the 


total  number  of  vehicles. 


■ 
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All  the  data  in  Series  C  to  H  correspond  to  the  same 
periods  relevant  to  the  NC>x  data  in  Series  A  and  Series  B. 
Thus  the  hourly  data  are  observed  for  the  period  February 
22  to  March  7 ,  1967  and  the  daily  data  for  the  period 
April  1,  1971  to  March  31,  1972.  However  the  data  avail¬ 
able  for  the  input  factors  leave  something  to  be  desired 
in  the  sense  that  they  are  not  collected  where  NO^  is 
being  monitored.  Since  the  input  factors  are  monitored 
far  away  from  the  NO  monitoring  station  they  may  fail 
to  give  any  explanation  about  the  NO  behavior.  As  the 
results  of  the  transfer  function  identification,  discussed 
in  the  next  section,  shows,  only  temperature  helps  to 
explain  the  behavior  of  NO^  in  the  city. 


4.2  Dynamic  System  Model  -  Identification 

In  this  section  the  transfer  function  model  connecting 

output  NO^  to  each  of  the  input  factors  will  be  identified 

wherever  it  is  found  that  there  is  a  dynamic  relationship 

between  the  output  and  the  particular  input.  Since  the 

correlation  function  is  used  extensively,  the  correlation 

function  and  its  sample  estimator  are  defined  below. 

The  crosscorrelation  function  p  (k) ,  of  input  X 

xz  *c. 

and  output  Z  at  lag  k  is  given  by 


PxZ(k)  =  E^(xt  lV  (Zt+k  tJZ)  J  for  k  =  0,  1,  2, 

ax  aZ 


*  I 
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and 


xZ 


(k)  =  Et(xt+k  -  V  (zt  -  V“ 


G  °r, 

x  Z 


for  k  =  -1,  -2, 


where  Ux  is  the  mean  of  the  process  Xfc,  and  Uz  is  the  mean 
of  the  process  Z^.  As  the  theoretical  crosscorrelation  is 
unknown  the  observed  sample  corsscorrelation  function  is 
used  for  identification.  Usually  the  crosscorrelation  at 

lags  k  =  0/  1,  2,  . .  is  the  required  part  of  the 

function  needed  for  identification  of  the  transfer 
function,  hence  that  part  of  the  observed  crosscorrelation 
function  is  given  here.  The  observed  sample  crosscorre¬ 
lation  function  r  (k) ,  is  defined  as 

xz  ' 


r  (k)  =  C  (k)/s  s  , 

XZV  '  XZ  '  X  Z 


(4.2.1) 


1 


N-k 


where  Cxz(k)  =  N  l  <xt  -  x)  (zt+k 


-  z)  ,  k  =  0 ,  1,  2, 


•  •  r 


s  =  /c  (oT, 

X  XX 


s  =  /C  (0)  , 
z  zz 


x  is  the  observed  mean  of  the  x^  series,  and  z  is  the 
observed  mean  of  the  z^  series .  From  now  on  the  observed 
sample  crosscorrelation  function  of  x^.  and  z^  will  be 
referred  to  simply  as  the  crosscorrelation  function  and 
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will  be  written  as  r(k)  . 

The  impulse  response  function  v^  ,  j  =  0,  1,  2,  . , 

relating  the  output  z  to  the  input  x  can  be  obtained  from 

the  crosscorrelation  function  of  x,  and  z ..  One  method 

t  t 

of  obtaining  the  impulse  response  function  involves  solving 
a  set  of  linear  equations  for  v^ .  This  is  like  solving 
the  Yule  -  Walker  equations  (3.3.1)  in  section  3.3,  where 
crosscovariances  from  lag  zero  to  a  conjectured  lag  K  are 
substituted  for  autocorrelations  on  the  right  side  of  the 
equations,  and  autocovariances  of  the  input  series  are 
substituted  for  the  autocorrelations  on  the  left  side. 

This  approach  is  defective  in  that  it  requires  the  know¬ 
ledge  of  a  lag  K  beyond  which  v.  is  effectively  zero. 

In  the  preferred  second  method  which  is  used  in  the 
algorithm  (Appendix  D)  for  the  identification  of  the 
transfer  function  model,  this  lag  K  problem  is  not  en¬ 
countered  because  the  crosscorrelation  function,  hence 
Vj  decays  fast  and  so  K  can  be  set  as  low  as  10.  K  is 
set  arbitrarily  equal  to  14  for  the  algorithm  in  this 
work.  A  brief  discussion  of  the  second  method  is  given r 
in  the  following  paragraph. 

A  stochastic  model  is  fitted  to  the  input  series. 

The  stochastic  model  could  be  any  of  the  model  types 
(3.1.3)  discussed  in  section  3.1.  The  stochastic  model 
is  then  used  to  prewhiten  (i.e.  to  obtain  residuals  of 
input  x^)  the  input  series  {x^}  such  that  the  residuals 
y^_  obtained  are  a  white  noise  process.  The  same  model 


I  <\ 


. 
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fitted  to  the  input  series  is  used  to  transform  the  output 
series  {z^_}  such  that  some  sort  of  residuals  w^_,  say, 
which  are  not  necessarily  white  noise  will  be  obtained. 

The  crosscorrelation  function  r(k)  of  the  input  residual 
yt  and  the  output  residual  w^  obtained  by  this  procedure 

is  estimated  according  to  (4.2.1)  for  k  =  0,  1,  . .  K 

where  for  practical  purposes  K  14 .  Then  the  impulse 
response  function  is  given  by 

r  (k)  s 

=  - —  '  for  k=0,  1,  . ,  K .  (4.2.2) 

s 

y 

After  obtaining  the  impulse  response  function  in  this 

way,  the  parameters  a^  (i  =  1,  2,  . .  p)  and  b^ 

(i  =  0,  1,  . .  s)  can  be  obtained  according  to  (4.1.3). 

Since  as  defined  in  (4.2.2)  is  directly  propor¬ 
tional  to  r(k),  r(k)  can  be  used  to  decide  the  signifi¬ 
cance  of  v^ .  If  r(k)  is  significantly  different  from 
zero  at  lag  k,  it  can  be  concluded  that  is  signifi¬ 
cantly  different  from  zero  at  lag  k.  Since  r(k)  has 
approximately  a  normal  distribution,  twice  the  observed 
standard  deviation  of  r(k)  can  be  used  to  test  whether 
or  not  r(k)  is  significantly  different  from  zero  with 
approximately  95%  probability.  Furthermore  one  of  the 
series  (i.e.  residuals  y  )  used  to  estimate  r(k)  is  known 
to  be  white  noise,  so  that  the  standard  deviation  (STD) 


« 
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of  r(k)  is  given  by 

STD  =  — — - —  (4.2.3) 

/ (n-k) 

More  details  about  the  distribution  of  r(k)  are  given  in 
Bartlett  [4],  Further  details  of  the  identification 
procedure  for  the  transfer  function  model  using  the  cross- 
correlation  function  can  be  found  in  Box  and  Jenkins [6] 
while  the  use  of  cross  spectral  analysis  for  the  same 
purpose  can  be  found  in  Jenkins  and  Watts  [28]. 

In  the  rest  of  this  section  the  results  obtained 
by  following  the  above  identification  procedure, 
which  is  the  basis  of  the  transfer  function  model  iden¬ 
tification  algorithm  in  Appendix  D,  will  be  presented. 

The  algorithm  (implemented  in  Fortran  IV;  see  program  4 
in  Owolabi  [52])  is  used  to  identify  transfer  function 
models  for  the  NO  hourly  average  series,  and  the 
daily  maximum  NO  series.  For  the  purpose  of  the  analysis 

X 

here  and  the  rest  of  the  chapter,  the  input  series  and  the 
output  series  are  transformed  according  to  the  transfor¬ 
mations  presented  in  Table  4.1.  From  now  on,  reference 
to  any  one  of  these  series  implies  reference  to  its 


trans  formation . 


. 

' 
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TABLE  4 . 1 


Raw  Data  Transformation 


Series 

Raw  Data 

Trans  formation 

NO 

X 

zt 

ln(z^  + 

0.001) 

Temperature 

xt 

ln(xt  + 

50.0) 

Wind  Speed 

xt 

In  (xt) 

Traffic  Flow 

xt 

In  (xt) 

NO^  Hourly  Average 

Figure  4.2  shows  the  plot  of  hourly  NO  series  with 

X 

the  corresponding  hourly  temperature  series,  wind  speed 
series,  and  traffic  flow  series.  The  plotted  series  are 
the  transformations  of  the  data  in  series  A  for  NO 

X 

hourly  averages,  series  C  for  hourly  temperature,  series 
E  for  hourly  wind  speed,  and  series  G  for  hourly  traffic 
flow.  The  relationship  of  NO  hourly  averages  with  any 
of  the  input  factors  is  not  obvious  from  the  plot.  How¬ 
ever,  the  possible  transfer  function  relating  NO  to  each 
of  them  is  investigated  using  the  procedure  emphasized 
earlier  in  this  section  and  the  algorithm  in  Appendix  D. 

The  stochastic  model  which  was  found  to  fit  the 
hourly  temperature  series  is  a  modified  ARMA  (2,0)  . 

Using  the  stochastic  model  fitting  method  of  Chapter  III, 
the  hourly  temperature  model  turns  out  to  be 
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Fig.  4.2  Transformed  hourly  Inputs  and  Output 
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(1  -  0.338B  -  0.211B2)  (1  -  B)  Xfc  =  y  (4.2.4) 

This  model  is  used  to  "prewhiten"  the  temperature  series 
such  that  the  remaining  series  of  residuals  yt  is  merely 
white  noise  and,  hence,  have  zero  autocorrelation.  The 
same  model  (4.2.4)  is  used  to  transform  the  z  series, 
the  resulting  series  is  of  course  not  necessarily  un¬ 
correlated.  The  estimated  crosscorrelation  function 
r(k) ,  of  the  two  new  series  is  plotted  in  Figure  4.3 
and  its  values  with  the  standard  deviation  STD,  are  given 
in  Table  4.2.  An  estimate  of  the  impulse  response  function 
vk  is  obtained  from  the  r(k)  according  to  (4.2.2).  At 
this  point,  the  significance  of  r(k)  implies  the  signi¬ 
ficance  of  v^.  Comparing  r(k),  k  =  0,  1,  . .  14  with 

twice  its  standard  deviation  [i.e.  ±2  (STD)  J,  in  Table 
4.2,  it  is  found  that  only  r(3)  may  be  considered  as 
significantly  different  from  zero.  r(3)  =  0.107  is  com¬ 
paratively  close  to  twice  its  standard  deviation  of 
0.110.  All  other  r(k),  k  =)=  3  are  not  as  close  to 
twice  their  standard  deviations  as  r(3). 

This  means  the  effect  of  an  incremental  change  in  the 
level  of  hourly  temperature  is  delayed  two  hours  before 
it  is  apparent  in  the  level  of  hourly  NO  concentration 

X 

A 

in  the  urban  atmosphere.  Since  only  v^  is  significantly 

/\  A 

different  from  zero,  b^  =  v^  =  2.82  and  b^  is  the  only 
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parameter  in  the  transfer  function.  Therefore,  the 
transfer  function  describing  hourly  NO^  in  terms  of 

TABLE  4.2  Crosscorrelation  and  Impulse  Response 

Functions  of  prewhitened  hourly 
Temperature  and  transformed  NO^ 


Lag  k 

r  (k) 

STD  of  r(k) 

2  x  (STD) 

vk 

0 

0.022 

0.055 

0.110 

0.588 

1 

0.005 

0.055 

0.110 

0.119 

2 

-0.049 

0.055 

0.110 

-1.289 

3 

0.107 

0.055 

0.110 

2.824 

4 

0.046 

0.055 

0.110 

1.202 

5 

-0.048 

0.055 

0.110 

-1.272 

6 

0.015 

0.055 

0.110 

0.397 

7 

0.101 

0.055 

0.110 

2.675 

8 

-0.015 

0.055 

0.110 

-0.388 

9 

0.049 

0.055 

0.110 

-1.300 

10 

-0.075 

0.055 

0.110 

-1.988 

11 

0.004 

0.056 

0.112 

0.114 

12 

-0.055 

0.056 

0.112 

-1.441 

13 

0.016 

0.056 

0.112 

0.435 

14 

-0.014 

0.056 

0.112 

-0.370 

s  =  0.724 
w 


s 

y 


0.027 
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Fig.  4.3  Crosscorrelation  Function  of  prewhitened 
hourly  Temperature  and  transformed  NO, 
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Fig.  4.4  Crosscorrelation  Function  of  prewhitened 
hourly  Wind  Speed  and  transformed  NO 
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hourly  temperature  alone  can  be  written  as 

(1  -  B)  Z  =  bQ  (1  -  B)  Xt_3.  (4.2.5) 

The  model  fitted  to  the  hourly  wind  speed  series  is 
a  modified  ARMA  (3,0)  and  has  the  following  form: 

(1  +  0.239B  +  0.139B2  +  0.165B3)  (1  -  B)  X 

=  yt  (4.2.6) 

The  hourly  wind  speed  series  is  prewhitened  and  the 
hourly  NO^  series  transformed  according  to  this  wind 
speed  model  .  The  crosscorrelations  of  the  two  residuals 
yt  and  wfc  are  estimated  and  presented  in  Table  4.3  and 
Figure  4.4.  As  before  Table  4.3  also  contains  estimates 
of  and  the  standard  deviation  (STD)  of  r(k) .  From 
Table  4.3  it  can  be  seen  that  only  r(10)  is  significantly 
different  from  zero  when  compared  with  twice  its  standard 
deviation  (i.e.  r(10)  =  -0.116  ±  0.110).  However,  the  fact 
that  the  significance  of  r(10)  is  a  border  line  case 
coupled  with  the  fact  that  lag  10  is  such  a  large  lag 
makes  the  significance  of  r(10)  doubtful.  The  parameter 

/V  /s 

b0  =  v10  obtained  by  the  identification  procedure  was 
subjected  to  the  efficient  estimation  method  of  section 
4.3  and  b^  so  produced  was  found  to  be  not  significantly 
different  from  zero,  and,  hence,  was  omitted  from  the 
model.  This  means  that  a  change  in  the  hourly  wind  speed 
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TABLE  4.3  Crosscorrelation  and  Impulse  Response 

Functions  of  prewhitened  hourly  Wind  Speed 
and  transformed  NO 

x 


Lag  k 

r  (k) 

STD  of  r(k) 

2  x  (STD) 

vk 

0 

-0.022 

0.055 

0.110 

-0.049 

1 

-0.045 

0.055 

0.110 

-0.098 

2 

-0.032 

0.055 

0.110 

-0.071 

3 

-0.095 

0.055 

0.110 

-0.209 

4 

-0.004 

0.055 

0.110 

-0.009 

5 

-0.026 

0.055 

0.110 

-0.056 

6 

0.090 

0.055 

0.110 

0.198 

7 

0.075 

0.055 

0.110 

0.165 

8 

0.041 

0.055 

0.110 

0.090 

9 

0.059 

0.055 

0.110 

0.129 

10 

-0.116 

0.055 

0.110 

-0.254 

11 

-0.073 

0.056 

0.112 

-0.160 

12 

-0.078 

0.056 

0.112 

-0.170 

13 

-0.067 

0.056 

0.112 

-0.148 

14 

0.088 

0.056 

0.112 

0.194 

s  =  0.681 
w 

s 

y 


0.311 
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does  not  seem  to  influence  hourly  changes  in  the  NO^ 
level.  The  wind  speed  -  ignoring  direction  -  does  not 
seem  to  influence  NO^  since  the  wind  direction  is  probably 
an  important  characteristic  affecting  the  NO^  level.  In 
fact  it  has  been  shown  in  the  "Air  Pollution  Summary 
Edmonton"  published  by  the  Environmental  Health  Services 
Division  of  Alberta  Department  of  Health  that  south  and 
southwest  winds  are  correlated  with  high  NO^  concentration 
in  the  Administration  Building. 

The  hourly  traffic  flow  series  is  described  by  a 
seasonal  model  with  a  24  -  hour  period.  The  model  is 

(1  -  0.266B)  (1  -  B24)  Xt  =  y  +  0.568yt_1.  (4.2.7) 

As  for  temperature  and  wind  this  model  (4.2.7)  is  used 
to  prewhiten  the  hourly  traffic  flow  series  and  to  trans¬ 
form  the  NO^  hourly  average  series.  The  crosscorrelations 
of  the  two  new  series  are  given  in  Table  4.4  and  plotted 
in  Figure  4.5.  Comparing  r(k) 's  with  ±2 (STD)  in  Table 
4.4  it  can  be  seen  that  none  of  the  r(k) 's  are  significantly 
different  from  zero.  This  implies  that  an  incremental 
change  in  the  level  of  hourly  traffic  flow  has  no  effect 
on  hourly  NO^  concentration.  The  result  here  is  quite 
astonishing  in  view  of  the  fact  that  automobile  exhaust 
constitutes  a  source  of  NO  in  urban  areas.  However, 

X 

two  reasons  may  account  for  this  apparent  lack  of  auto¬ 
mobile  effect  here.  One  is  that  most  automobiles  nowadays 
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Crosscorrelation  and  Impulse  Response 
Functions  of  prewhitened  hourly  Traffic 
Flow  and  transformed  NO 

x 


r(k) 

STD  of  r(k) 

2  x  (STD) 

vk 

0.044 

0.057 

0.114 

0.145 

-0.055 

0.057 

0.114 

-0.181 

0.039 

0.057 

0.114 

0.126 

-0.082 

0.057 

0,114 

-0.269 

-0.007 

0.057 

0.114 

-0.024 

-0.005 

0.057 

0.114 

-0.026 

-0.009 

0.057 

0.114 

-0.031 

-0.003 

0.057 

0.114 

-0.011 

-0.054 

0.057 

0.114 

-0.177 

0.021 

0.057 

0.114 

0.069 

-0.018 

0.058 

0.116 

-0.059 

0.037 

0.058 

0.116 

-0.122 

0.034 

0.058 

0.116 

0.111 

-0 .016 

0.058 

0.116 

-0.052 

0.073 

0.058 

0.116 

0.241 

0.996 


0.304 
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have  pollution  control  devices.  The  other  and  perhaps  the 
most  important  in  this  case  is  that  the  traffic  flow  data 
available  are  collected  far  away  from  the  Administration 
Building  where  NO  is  monitored. 

X 

It  has  been  found  that  only  hourly  temperature  is  re¬ 
lated  to  hourly  NO  .  In  the  following  discussion  a  possible 

X 

transfer  function  connecting  daily  maxima  of  NO  with  each 

X 

of  the  input  factors  will  be  identified. 

Daily  Maxima  of  NO 
2  x 

The  daily  maximum  NO  series  and  the  daily  temperature, 

X 

wind,  and  traffic  flow  series  are  subjected  to  the  same  type 
of  analysis  as  the  hourly  data.  Series  D  is  the  daily  maxi¬ 
mum  temperature  series,  series  F  is  the  daily  mean  wind  speed 
series,  and  series  H  is  the  daily  traffic  flow  series.  The 
daily  maxima  of  NO  series  is  of  course  Series  B.  The  trans- 
formation  of  the  daily  data  is  identical  to  the  transforma¬ 
tion  used  for  the  hourly  data,  see  Table  4.1.  The  trans¬ 
formed  daily  data  are  plotted  in  Figure  4.6.  Again  the 
relationship  of  NO  with  any  of  the  three  input  factors  is 
not  obvious  from  Figure  4.6,  however  only  a  visual  compari¬ 
son  is  of  little  value  here,  as  the  hourly  data  indicated. 

The  stochastic  model  fitted  to  the  daily  maximum 
temperature  is 

(1  -  0.19B  +  0.183B2)  (1  -  B)  =  yfc 


(4.2.8) 
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After  using  the  model  to  prewhiten  the  daily  temperature 

series  and  transform  the  NO  series  the  crosscorrelation 

x 

function  of  the  two  resulting  series  are  estimated.  The 
estimated  crosscorrelation  function  r(k)  is  plotted  in 
Figure  4.7  and  its  values  with  the  standard  deviation 
(STD)  are  given  in  Table  4.5.  An  estimate  of  the  impulse 
response  function  v^  is  obtained  from  r(k)  according  to 

(4.2.2).  Comparing  r(k),  k  =  0,  1,  . ,  14  with  twice 

its  standard  deviation  [i.e.  ±2 (STD) ]  in  Table  4.5  it 
can  be  seen  that  r(0)  is  significantly  different  from  zero 
since  r(0)  ±2  STD  -  0.113  ±0.104  do  not  include  zero,  r(2) 

can  be  regarded  at  first  sight  as  being  significantly 
different  from  zero.  However,  results  from  efficient 
estimation  of  parameters  in  section  4.3  indicate  that 

r(2)  is  not  significantly  different  from  zero.  So  only 

/*\ 

one  impulse  response  weight  is  available  and  bQ  =  Vq  =  1.43. 
The  transfer  function  model  identified  for  temperature  alone 
is 


(1  -  B)  Z  -  b ( 1  -  B)  Xt 


(4.2.9) 


The  stochastic  model  fitted  to  the  daily  mean  wind 
speed  is 


(1  +  0.436B  +  0.337B2)  (1  -  B)  =  yfc. 


(4.2.10) 
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Crosscorrelation  Function  of  prewhitened 
daily  Temperature  and  transformed  NO 

X 


o 

o 

• 

o 

CC 
dJ 
CJ 
CO 
DO  □ 
CD  CD 

cc  • 

CJ  i 


-10.00 


Fig.  4.8 


-+ 


-5.00 


0.00 

LflG-K 


5.00 


10.00 


Crosscorrelation  Function  of  prewhitened 
daily  Wind  Speed  and  transformed  NO. 


15.00 


V 


, T~ 

dJ 

CJ  „ 

CO 
CO  0 
O  o 

*— - 1 - 1 - 1 - 1 - 1 - 1  I  1—  1  I 

u  '-10  00  -5.00  0.00  5.00  10.00  15.00 

LflG-K 

Fig.  4.9  Crosscorrelation  Function  of  prewhitened 
daily  Traffic  Flow  and  transformed  NO^. 


TABLE  4.5 


89 


Crosscorrelation  and  Impulse  Response 
Functions  of  prewhitened  daily  Temperature 
and  transformed  NO 

x 


Lag  k 

r  (k) 

STD  of  r(k) 

2  x  (STD) 

vk 

0 

0.171 

0.052 

0.104 

1.426 

1 

-0.067 

0.052 

0.104 

-0.554 

2 

0.113 

0.052 

0.104 

0.941 

3 

-0.043 

0.053 

0.106 

-0.355 

4 

-0.044 

0.053 

0.106 

-0.362 

5 

0.022 

0.053 

0.106 

0.183 

6 

-0.072 

0.053 

0.106 

-0.602 

7 

0.022 

0.053 

0.106 

0.179 

8 

0.074 

0.053 

0.106 

0.620 

9 

-0.053 

0.053 

0.106 

-0.437 

10 

0.029 

0.053 

0.106 

0.241 

11 

0.069 

0.053 

0.106 

0.575 

12 

-0.091 

0.053 

0.106 

-0.753 

13 

0.006 

0.053 

0.106 

0.050 

14 

0.048 

0.053 

0.106 

0.400 

s 


y 


1.058 


0.127 


90 


After  the  usual  prewhitening  and  transformation  operations 
using  model  (4.2.10) ,  the  r(k) 's  for  the  new  series  ob¬ 
tained  are  estimated  along  with  the  v^'s.  Table  4.6 
contains  the  r(k) 's  with  their  standard  deviations  and 
the  v^.  '  s .  Figure  4.8  shows  the  r(k)  's.  From  Table  4.6, 
it  can  be  seen  that  none  of  the  cross  correlations  are 
significantly  different  from  zero,  hence  as  in  the  hourly 
situation , the  daily  mean  wind  speed  fails  to  contribute 
significantly  to  the  behavior  of  the  daily  maximum  level 
of  NO  in  the  city. 

The  daily  traffic  flow  series  is  described  by  a 
seasonal  model  with  a  7  -  day  period.  The  model  is 

(1  -  0.276B)  (1  -  B7)  X  =  y  .  (4.2.11) 

As  for  temperature  and  wind  this  model  (4.2.11)  is  used  to 
prewhiten  the  daily  traffic  flow  series  and  to  transform 
the  NO^  daily  average  series.  The  crosscorrelations  of  the 
two  new  series  are  given  in  Table  4.7  and  plotted  in  Figure 
4.9.  Comparing  r(k) 's  with  2 (STD)  in  Table  4.7  it  can  be 
seen  that  none  of  the  r(k)'s  are  significantly  different 
from  zero.  This  means  no  transfer  function  connects  the 
daily  maximum  NO^  with  the  daily  traffic  flow. 

Since  the  foregoing  show  that  there  is  no  justification 
for  including  both  wind  speed  and  traffic  flow  in  the 
transfer  function  model  for  NO  ,  the  combined  transfer 

X 

function  -  noise  model  for  either  hourly  or  daily  series 
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Crosscorrelation  and  Impulse  Response 
Functions  of  prewhitened  daily  Wind  Speed 
and  transformed  NO 

x 


r(k) 

STD  of  r(k) 

2  x  (STD) 

vk 

-o . 036 

0.052 

0.104 

-0.055 

0.000 

0.052 

0.104 

0.001 

0.024 

0.052 

0.104 

0.057 

-0.025 

0.053 

0.106 

-0.059 

0.081 

0.053 

0.106 

0.19  3 

0.085 

0.053 

0.106 

0.201 

-0.007 

0.053 

0.106 

-0.018 

0.016 

0.053 

0.106 

0.039 

-0.029 

0.053 

0.106 

-0.069 

-0.044 

0.053 

0.106 

-0.104 

-0.006 

0.053 

0.106 

-0.015 

-0.013 

0.053 

0.106 

-0.031 

-0.005 

0.053 

0.106 

-0.011 

-0.025 

0.053 

0.106 

-0.060 

0.025 

0.053 

0.106 

0.059 

0.911 


0.382 
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Crosscorrelation  and  Impulse  Response 
Function  of  prewhitened  Traffic  Flow  and 
transformed  NO 

x 


r(k) 

STD  of  r (k) 

2  x  (STD) 

vk 

0.066 

0.053 

0.106 

0.900 

0.010 

0.053 

0.106 

0.134 

-0.059 

0.053 

0.106 

-0.803 

0.087 

0.053 

0.106 

1.183 

-0.039 

0.053 

0.106 

-0.525 

0.059 

0.053 

0.106 

0.795 

0.079 

0.053 

0.106 

1.071 

-0.029 

0.053 

0.106 

-0.396 

0.094 

0.053 

0.106 

1.270 

0.054 

0.053 

0.106 

0.727 

0.038 

0.054 

0.108 

0.519 

0.078 

0.054 

0.108 

1.054 

0.047 

0.054 

0.108 

0.631 

-0.012 

0.054 

0.108 

-0.163 

0.021 

0.054 

0.108 

0.279 

1.180 


0.087 


will  include  only  temperature  Xt  and  the  noise  term  E  . 
Thus,  using  (4.2.5),  the  combined  transfer  function  - 

noise  model  for  NO  is 

x 

(1  -  B)  Z t  =  b(l  -  B)  Xt_s  +  Et  .  (4.2.12) 

For  the  hourly  series  when  the  initial  estimate  of  b  is 

/\ 

b  =  2.82,  and  from  (4.2.9)  ft  is 

/\ 

(1  -  B)  Z.  =  b ( 1  -  B)  X  +  E  (4.2.13) 

t  t  t 

For  the  daily  series  where  the  initial  estimate  of  b 

/\ 

is  b  =  1.43. 

The  stochastic  models  identified  for  E  in  each  of 
(4.1.12)  and  (4.2.13)  are  of  the  form: 

Ef  =  et  -  hxet-l  “  ll2et-2  ~  h3et~3'  anc^  (4.2.14) 

Et  =  Gt  hlSt-l  ”  h2et-2'  (4.2.15) 

respectively.  The  parameters  fm  (i  =  1,2, . )  for  each 

of  these  models  will  be  estimated  along  with  the  parameter 
of  the  corresponding  transfer  function  in  the  next  section. 

Before  the  parameter  estimation  stage  it  is  necessary 
to  describe  briefly  at  this  point  what  the  combined  trans¬ 
fer  function  -  noise  model  should  have  looked  like  if 
any  of  the  models ( 4 . 2 . 12)  and  (4.2.13)  had  at  least  two 
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input  factors  instead  of  just  one.  This  is  necessary  for 
this  approach  to  be  generally  useful  for  pollution  models 
although  the  data  does  not  justify  more  than  one  input 
in  this  thesis.  Let  the  daily  series, for  example, have 
two  input  factors  related  to  it  instead  of  one.  Then  we 
would  have  two  identified  transfer  functions  for  the  out¬ 
put  series.  Let  traffic  flow  be  the  second  input  factor 
so  related  where  the  relationship  is  expressible  simply  as 

(1  -  B7)  Z  =  b ( 1  -  B7)  Xt  .  (4.2.16) 

Then  combining  (4.2.9)  for  daily  temperature  with  (4.2.16) 
for  daily  traffic  flow,  the  two  identified  input  transfer 
functions  could  have  been  combined  as  follows: 

=>  Zt  =  (1  -  B)"1  b1(l  -  B)  X±  t 

+  (1  -  B7)-1  b2(l  -  B7)  X2  t  (4.2.17) 

k]_(l  ”  B)  Xi,t  i  b2^1  ”  B  ^  X2,t 

=>  zi-  ~  +  7 

(I  -  B)  (1  -  B') 

=>  (1  -  B)  (1  -  B7)  Z  =  b1(l  -  B)  (1  -  B7)  X1  . 

+  b2(l  -  B)  (1  -  B7)  X2  t 


. 
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*>  (1  -  B  -  B7  +  B8)  Z  t  =  b1(l  -  B  -  B7  +  B8)  XJL 


+  b2 (1  -  B  -  B7  +  B8)  X2  t 


:>  Zt  =  Zt-1  +  Zt-7  -  Zt-8  +  (1  '  B  -  B7  +  B8)  Xx 


+  fc>2  (1  -  B  -  B7  +  B8)  X2  t 


The  function  (4.2.17)  can  be  generalized  for  m  imput 
factors  where  m  >  2 . 

Let  the  estimate  of  by  (4.2.9)  be 


2-i.  z  ,+b,(l  B)  X,  .  _ 

1/t  t-1  1  l,t  and  estimate  of 

Zt  by  (4.2.16)  be 


Z2,t  Zt-7 
then  the  series  E 


+  b2 (1  -  B  )  x2  t 
.  can  be  obtained  from 


E,  =  z,  -  z,  ,  -  z0  , 
t  t  1 ,  t  2  ,  t 


(4.2.18) 


and  then  a  stochastic  model  can  be  identified  for  series 
E^.  Let  the  stochastic  model  identified  for  E^  be  of  the 
form 


E t  =  h (B)  et. 


(4.2.19) 


then  combining  (4.2.17)  and  (4.2.19)  we  have 
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Zt  =  (1  -  B)  1  b1(l  -  B)  X1  t  +  (1  -  B7)  1b2(l  -  B7)X2(t 

+  h (B)  e  ,  (4.2.20) 

which  is  known  as  the  combined  transfer  function-noise  model. 
Once  the  model  has  been  identified  in  this  form,  the  para¬ 
meters  have  to  be  estimated  together  in  order  that  the 
non-significant  parameters  can  be  detected  and  rejected. 

An  algorithm  that  can  produce  efficient  estimates  of 
the  parameters  will  be  discussed  in  the  next  section. 

4 . 3  Dynamic  System  Model  -  Estimation 

In  this  section  efficient  estimation  of  the  para¬ 
meters  of  the  combined  transfer  function-noise  models 
identified  in  section  4.2  will  be  made.  The  estimates 
can  be  obtained  easily  by  the  use  of  the  Marquardt 
algorithm  for  nonlinear  least  squares.  The  theoretical 
basis  of  the  algorithm  is  stated  by  Marquardt  [36]  and 
the  practical  application  of  it  is  briefly  indicated 
in  Box  and  Jenkins  [6].  The  algorithm  supplies  the  para¬ 
meter  estimates  that  minimize  the  residual  sum  of  squares, 
the  covariances  of  the  parameters,  and  the  variance  of 
the  residuals  generated  by  the  optimal  parameter  values. 

The  algorithm,  described  in  detail  in  Appendix  E,  causes 
rapidi  convergence  (in  most  cases  within  ten  iterations, 
provided  a  minimum  exists)  to  the  optimal  values  even  if 
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zeros  are  given  as  initial  values  of  the  parameters.  As 
for  the  other  algorithms  in  this  work,  a  Fortran  IV 
program  (see  program  5  in  Owolabi  [52])  is  written  to 
implement  this  algorithm.  The  rest  of  this  section 
consists  of  the  results  obtained  from  using  this  algorithm 
for  the  hourly  and  daily  models. 

NO  Hourly  Average 

X 

The  combined  transfer  function-noise  model 
identified  for  the  hourly  series  in  section  4.2,  see 
(4.2.12) ,  is 

(1  -  B)  Z  =  b  ( 1  -  B)  X  -  +  E  . 
t  t  ^ 

Considering  E^_  as  an  ARMA  (0,3)  model,  the  above  can  be 
written  as 

(1  -  B)  Z  =  b  ( 1  -  B)  Xt_3  +  et-h1et_1  (4.3.1) 

^2et-2  ~  ^3et-3' 

where  h1#  h2,  h3  are  the  moving  average  parameters  for 
the  stochastic  model  of  the  E^_  series,  and  b  is  known  from 
section  4.2  as  the  only  transfer  function  parameter.  From 

this  model  ( 4 . 3.1)  the  residual  series,  ,  t  —  1,2, . . 

n  can  be  generated  as 
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et  ^  Zt  b^1  Xt-3  +  hlet-l  +  h2et-2 


+  h3et-3 


for  different  sets  of  values  of  b,  h^,  h^,  h in  order  for 
the  algorithm  to  determine  the  optimum  set  of  parameter 
values  .  As  in  Chapter  III,  the  initial,  unknown  e^'s 

are  set  to  zero.  Since  b  can  assume  any  real  value  as 
stated  in  section  4.1  and  h^,  ,  h^,  being  moving 

average  model  parameters,  have  to  obey  the  invertibility 
conditions  discussed  in  section  3.2,  b  can  quite  arbit¬ 
rarily  be  set  to  1.0  and  h^,  h^t  h^  set  to  0.0  as  the 
initial  parameter  values  in  the  Marquardt  algorithm  (i.e. 
the  program  thereof).  With  these  initial  values,  the 
following  set  of  optimal  parameter  values  is  determined: 

b  is  2.501  with  standard  deviation  of  1.032, 
h^  is  0.088  with  standard  deviation  of  0.055, 

is  0.050  with  standard  deviation  of  0.055, 

h^  is  0.125  with  standard  deviation  of  0.055, 

where  the  corresponding  variance  of  the  residuals  is 
0.  405. 

If  the  optimal  estimates  of  the  parameters  obtained 
are  compared  with  twice  their  standard  deviations,  it  can 
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be  seen  that  only  b  and  are  significantly  different 
from  zero.  Therefore  the  combined  transfer  function  - 
noise  model  relating  hourly  oxides  of  nitrogen  to  tempera¬ 
ture  can  be  parsimoniously  expressed  as 


(1  -  B)  Z  =  2.501(1  -  B)  X.  0  +  e  -  0.1256,.  (4.3.2) 

t  t-3  t  t-3 

Daily  Maximum  NO 
2_ x 

Most  of  the  comments  made  above  in  connection  with 
the  hourly  model  estimation  apply  to  the  daily  model  and 
so  they  will  not  be  repeated.  The  identification  procedure 

gives  temperature  as  the  only  leading  indicator;  hence,  the 
combined  transfer  function  -  noise  model  identified  for 

the  daily  series  can  be  written  from  (4.2.12)  and  (4.2.15) 
as 

(1  -  B)  Zt  =  b ( 1  -  B)  Xt  +  et  -  h1et_1  -  h2et_2 . ( 4 . 3 . 3 ) 

From  (4.3.3)  the  residual  series  efc  can  be  generated 
according  to 


e  =  (1  -  B)  Zt  -  b ( 1  -  B)  Xt  +  h1et_1  +  h2et-2* 


The  initial  values  of  the  parameters  supplied  to  the 
program  of  the  Marquardt  algorithm  are  arbitrarily  chosen 
=  1.0,  h  =  h~  =  0.0.  The  program  produces  the 
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following  optimal  estimates  of  the  parameters: 

b  is  0.971  with  standard  deviation  of  0.344, 
h-^  is  0.491  with  standard  deviation  of  0.052  , 
h.2  is  0.268  with  standard  deviation  of  0.052, 

where  the  variance  of  the  residuals  is  0.742. 

All  three  parameters  are  significantly  different  from 
zero  when  compared  with  twice  their  standard  deviations. 
Hence,  the  combined  transfer  function  -  noise  model 
relating  daily  maximum  NO  to  daily  maximum  temperature 

X 

is  given  by 

(1  -  B)  Z  =  0.971  (1  -  B)  Xt  +  et  -  0.491et_x  (4.3.4) 

-  0.268et_2* 

Before  the  two  models  established  here  are  accepted 
as  the  final  dynamic  system  models  suitable  for  represent¬ 
ing  the  behavior  of  oxides  of  nitrogen  in  the  urban  atmos¬ 
phere,  they  must  be  submitted  to  careful  diagnostic 
checking.  The  diagnostic  checks  are  applied  in  the  next 
section . 

4 . 4  Dynamic  System  -  Diagnostic  Checks 

Comparing  the  residual  variances  obtained  in  the  last 
section  with  the  residual  variances  obtained  through  the 
stochastic  models  fitted  in  Chapter  III  and  shown  in 
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Table  4.8,  it  can  be  seen  that  the  dynamic  models  are 
as  good  as  the  stochastic  models  if  not  better,  because 
their  residual  variances  are  smaller  than  residual  variances 
of  the  corresponding  stochastic  models. 

TABLE  4.8  Variance  Comparison  of  Stochastic  and 

Dynamic  System  Models 


NO 

X 

Series 

Variance  of 

Transformed 

Data 

Stochastic 

Model ' s 

Residual 

Variance 

Dynamic  System 

Model ' s 

Residual 

Variance 

Hourly 

Averages 

1.202 

0.522 

0.405 

Daily 

Maxima 

1.577 

0.772 

0.742 

The  first  test  therefore  concerns  the  variances  of 
the  residuals  which  are  found  to  be  smaller  than  the 
variances  of  the  transformed  data  shown  in  Table  4.8. 

Since  the  variances  of  the  residuals  are  very  small  it  can 
be  concluded  that  the  dynamic  models  are  adequate. 

One  other  test  may  be  applied  for  the  diagnostic 
check  of  the  dynamic  system  models.  This  test  uses  the 


; ' 
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Chi  -  square  test  as  discussed  in  section  3.5  where  the 
Chi  -  square  random  variable  Q  is  a  function  of  the 
autocorrelation  function  of  the  residuals  according  to 

(3.4.1)  .  The  observed  Chi  -  square  value  Q  of  the  residual 
autocorrelations  corresponding  to  (4.3.2)  is  73.25.  The 
theoretical  upper  critical  Chi  -  square  value  at  95%  for 

81  degrees  of  freedom  is  103.01.  Thus  we  accept  the 
hypothesis  that  the  residuals  behave  in  a  white  noise 
fashion,  implying  that  they  are  uncorrelated.  For  the 
daily  maximum  NO  ,  the  observed  Chi  -  square  value  is 
Q  =  65.26  while  the  theoretical  upper  critical  Chi  - 
square  value  at  95%  for  88  degrees  of  freedom  is  110.90. 
This  also  indicates  that  the  residuals  of  (4.3.3)  seem  to 
behave  like  white  noise.  These  tests  show  that  the 
combined  transfer  function  -  noise  models  for  both  the 
hourly  averages  and  daily  maxima  of  NO  as  given  by 

(4.3.2)  and  (4.3.3),  respectively  explain  the  NO  behavior 
well.  Thus  the  models  can  be  used  for  forecasting  the 
oxides  of  nitrogen  concentration  in  the  urban  atmosphere. 

4 . 5  Forecasting  with  the  Dynamic  System  Models 

NO^  Hourly  Average 

To  forecast  zfc(L)  (i.e.  transformed  hourly  average  of 

NO  )  at  L  hours  ahead  of  a  time  origin  t,  the  forecast 
x 

function  corresponding  to  the  hourly  model  (4.3.2)  is 


103 


zt(L) 


t+L- 1  +  2*501  (xt+L-3  Xt+L-4} 


0.125  e 


t+L- 3 


(4.5.1) 


The  terms  on 


z 


t+j 


( 

( 

( 


x 


t+j 


( 

( 

( 


e 


t+j 


( 

( 

( 


the  right 

hand  side  of 

Zt+j 

j  l  o 

Vj) 

j  >  0  , 

x^_  ,  . 
t+3 

j  1  0 

Vj) 

V 

o 

et+j 

j  i  o 

0 

V 

o 

• 

(4.5.1)  are  given  as: 


(4.5.2) 


xt(j)  is  calculated  from  model  (4.2.4)  fitted  to  the  input 
series  as: 


x  (j)  =  1.338  x,  .  -j  -  0.127  x,  .  0  -  0.211  x ,  ,  .  0 
t  t+]-l  t+j-2  t+3-3 


and  e^  is  calculated  from  e  =  7.  -  z^_^_  (1)  • 

Using  (4.5.1)  the  lead  -  one  and  leads  1-30 
forecasts  of  the  last  30  hours  for  the  hourly  series  are 
obtained.  Figure  4.10  shows  the  plot  of  these  forecasts 
with  their  95%  confidence  limits  and  also  the  corresponding 
observed  series.  From  Figure  4.10  it  can  be  seen  that  all 

the  observed  values  fall  within  95%  confidence  limits  of 

/\ 

their  lead  -  one  forecasts  which  is  z^  (1)  ±  1.247  in  the 


LOG  (NO*) 

,-12.00  -10.00  -8.00  -6.00  -14.00  -2.00  0.00  2.00 
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TIME 

Transformed  hourly  NO  and  Lead  -  one 
Forecasts  using  Dynamic  System  Model 


Fig.  4 . 10 
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logarithmic  units.  For  the  purpose  of  comparison  with 
the  raw  data,  the  forecasts  are  transformed  back  to  the 
raw  data  units  by  taking  the  antilog  and  subtracting 
0.001.  This  transformed  forecasts  is  presented  in  Table 
4.9  while  the  corresponding  observed  raw  data  is  pre¬ 
sented  in  Table  4.10.  It  can  be  seen  from  the  two  tables 
that  most  of  the  lead  -  one  forecasts  are  close  to  the 
observed  values. 

The  leads  1  to  30  forecasts  with  their  95%  con¬ 
fidence  limits,  and  also  the  corresponding  observed 
series  are  presented  in  Figure  4.11  in  logarithmic 
units .  Although  none  of  the  observed  values  fall  out¬ 
side  the  confidence  limits,  the  confidence  limit  itself 
is  so  wide  that  none  could  have  fallen  outside  it.  How¬ 
ever  this  model  can  be  used  to  forecast  hourly  NO  three 
hours  ahead.  This  is  an  improvement  over  the  stochastic 
model  for  the  hourly  series  which  can  forecast  reliably 
only  one  hour  ahead.  The  forecasts  here  also  are  trans¬ 
formed  back  to  the  raw  data  unit  and  presented  in  Table 


4.11. 


-12 . 00  -10.00  -8.00  -6.00  -4.00  -2.00  0.00  2.00 


106 


.00  306.00  316.00  326.00  336.00 

TIME 

Transformed  hourly  NO  and  Leads  1  to  30 
Forecasts  Using  Dynamic  System  Model 


Fig.  4 . 11 


TABLE  4.9 


Lead  -  One  Forecasts  of  the  last  thirty 
hours  of  NO^  Hourly  Averages  Series  using 
Dynamic  System  Model  * 


107 


0.017 

0.018 

0.039 

0.05  8 

0.042 

0.028 

0.006 

0.003 

0.003 

0.003 

0.003 

0.003 

0.003 

0.003 

0.007 

0.011 

0.007 

0.008 

0.004 

0.013 

0.016 

0.010 

0.012 

0.002 

0.002 

0.003 

0.002 

0.002 

0.001 

0.000 

*  Read 

across  the 

page.  The 

unit  of  measurement 

is  ppm. 

TABLE  4 

.10  The 

last  thirty 

hours  of  NO 

X 

Hourly 

Averages 

Series  * 

0.022 

0.040 

0.060 

0.050 

0.030 

0.006 

0.003 

0.003 

0.003 

0.003 

0.003 

0.003 

0.003 

0.007 

0.010 

0.007 

0.008 

0.003 

0.011 

0.012 

0.011 

0.002 

0.002 

0.002 

0.002 

0.002 

0.002 

0.001 

0.000 

0.000 

*  Read  across  the  page.  The  unit  of  measurement  is  ppm. 
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TABLE  4.11  Leads  1  to  30  Forecasts  of  the  last  thirty 

hours  of  NO  Hourly  Averages  Series  using 

X 

Dynamic  System  Model  * 


0.017 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0  .014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.014 

0.034 

0.014 

0.014 

0.014 

*  Read  across  the  page.  The  unit  of  measurement  is  ppm. 


Daily  Maximum  NO^ 

Using  the  model  (4.3.3)  for  the  daily  maxima  of 
NO^,  the  forecast  function  for  the  daily  maximum  series 
is  given  by 


zt(L)  Zt+L-1  +  0,971  (xt+L  Xt+L-1)  0,491  et_1 

-  0.268  et_2  ,  (4.5.3) 


where  (4.5.2)  can  be  used  to  obtain  zt+j  / 

e  .  while  xji)  is  calculated  from  model 
t+j  t 

to  the  input  series  as : 


x 


t+j  ' 


(4.2.8) 


and 

fitted 
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Xt(j)  1,19  Xt+j-l  °*373  xt+j-2  +  0,183  xt+L-3  * 

Function  (4.5.3)  is  used  to  obtain  lead  -  one  and 
leads  1  to  30  forecasts  of  the  last  30  days  of  the  daily 
maximum  series.  Figure  4.12  shows  the  plot  of  these 
forecasts  with  their  95%  confidence  limits  and  also  the 
corresponding  observed  series.  From  Figure  4.12  it 
can  be  seen  that  all  the  observed  values  fall  within  95% 
confidence  limits  of  their  lead  -  one  units.  For  the 
purpose  of  comparison  with  the  raw  data  observed,  the 
forecasts  are  transformed  back  to  the  raw  data  units 
by  taking  the  antilog  and  subtracting  0.001.  This  trans¬ 
formed  forecast  is  presented  in  Table  4.12  while  the 
corresponding  observed  raw  data  are  presented  in  Table 
4.13.  It  can  be  seen  from  the  two  tables  that  most  of 
the  lead  -  one  forecasts  are  close  to  the  observed 
values . 

The  leads  1  to  30  forecasts  with  their  95%  con¬ 
fidence  limits,  and  also  the  corresponding  observed 
series  are  presented  in  Figure  4.13  in  logarithmic  units. 

As  shown  in  Figure  4.13  all  the  observations  fall  within 
the  95%  confidence  limits  of  their  corresponding  forecasts. 
The  forecast  function  here,  as  in  the  corresponding 
stochastic  model  forecast  function,  can  predict  reliably 
two  days  ahead.  However  the  forecast  values  here  are 
closer  to  the  raw  observations  than  the  forecast  values 
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TABLE  4.12  Lead  -  one  Forecasts  of  the  last  thirty 

days  of  Daily  Maxima  of  NO  Series  usiny 

X 

Dynamic  System  Models  * 


0.094 

0.079 

0.065 

0.082 

0.042 

0.063 

0.083 

0.047 

0.075 

0.068 

0.073 

0.060 

0.105 

0.043 

0.156 

0.080 

0.125 

0.090 

0.201 

0.167 

0.087 

0.088 

0.097 

0.114 

0.071 

0.077 

0.095 

0.152 

0.115 

0.114 

*  Read 

across  the 

page.  Unit 

of  measurement  is  ppm. 

TABLE  4 

. 1 3  The 

last  thirty 

days  of  Daily 

Maxima  of 

NO 

X 

Series  * 

0.153 

0.072 

0.094 

0.027 

0.053 

0.121 

0.042 

0.073 

0.074 

0.081 

0.057 

0.156 

0.028 

0.281 

0.117 

0.163 

0.098 

0.381 

0.279 

0.058 

0.050 

0.083 

0.124 

0.052 

0.054 

0.100 

0.281 

0.151 

0.113 

0.132 

*  Read 

across  the 

page .  The 

unit  of  measurement  is 

ppm. 
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TABLE  4.14  Lead  Times  1  to  30  Forecasts  of  last  thirty 

days  of  Daily  Maxima  of  NO  Series  using 

X 

Dynamic  System  Model  * 


0.094 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

0.060 

*  Read  across  the  page.  The  unit  of  measurement  is  ppm. 

There  are  no  dynamic  system  models  developed  for 
Calgary,  Windsor,  Sarnia,  Toronto,  and  Sudbury  because 
the  necessary  additional  data  for  wind , traffic , and  temperature 
are  not  available.  However,  daily  maximum  series  of  oxides 
of  nitrogen  are  available  for  these  five  cities  and  are  used 
to  develop  stochastic  models  for  oxides  of  nitrogen  in  those 
cities  in  Chapter  V. 


. 
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CHAPTER  V 


COMPARATIVE  STUDY  OF  NO  IN  SOME  OTHER  CITIES 

X 

5 . 1  Stochastic  Models  for  Five  other  Cities 

In  this  chapter  the  daily  maximum  NO^  stochastic 
model  as  developed  for  Edmonton  in  Chapter  III  is  compared 
with  the  same  type  of'  model  established  here  for  Calgary, 
Sarnia,  Sudbury,  Toronto,  and  Windsor.  As  will  be  apparent 
shortly  the  models  for  the  six  cities  are  all  of  the 
ARIMA  (0,  1,  2)  type,  and  hence  an  attempt  is  made  to 
establish  a  general  ARIMA  (0,  1,  2)  model  which  may  be 
able  to  explain  the  behavior  of  the  daily  maximum  of  NO^ 
in  any  urban  center.  It  should  be  noted  that  the  stochas¬ 
tic  model  established  for  Edmonton's  daily  maximum  NO 

A 

series  is  also  an  ARIMA  (0,  1,  2) ,  but  written  (for  the 
purpose  of  analysis) in  its  modified  ARMA  (0,2)  form, 
because  the  model  is  expressed  as 

wt  =  et  ~  ^1  et-l  ~  ^2  et-2  ’ 
where  wfc  =  (1  -  B)  zt  =  zt  ~  zt-l  * 

The  ARIMA  form  will  be  used  throughout  this  chapter  because 
it  shows  the  degree  of  differencing. 

To  save  space, data  employed  for  the  development  of  the 
stochastic  ..model  for  the  five  cities  are  not  included  in 
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this  report.  The  interested  reader  can  obtain  them  from 
the  sources  mentioned  in  Chapter  II . 

The  various  series  of  the  original  daily  NO^  data 
are  all  transformed  by  first  adding  0.001  ppm  and  then 
taking  the  natural  logarithm,  thus  resulting  in  the 
transformed  series  zt's.  The  same  model  building  procedure 
as  outlined  in  Chapter  III  is  used  for  Calgary  and  each  of 
the  four  Ontario  cities.  The  final  stochastic  model  and 
relevant  parameter  estimates  are  described  briefly  below. 

Calgary  NO^ 

The  daily  maximum  NO  data  available  for  Calgary 

X 

consists  of  730  observations  (for  1970,  1971)  which  is 
quite  an  adequate  series  for  stochastic  modelling.  The 
model  is  found  to  be  ARIMA  (0,  1,  2)  and  this  is  of  the 
form 

(1  -  B)  Z  =  (1  -  b±  B  -  b2  B2)  et  ,  (5.1) 


where 


b  =  0.564  with  standard  deviation  of  0.031, 
b  =  0.214  with  standard  deviation  of  0.036, 


with  corresponding  residual  variance  of  0.294.  The 
variance  estimate  of  the  transformed  series  is  0.492. 
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Sarnia  NO 

x 

The  1971  daily  maximum  NO^.  data  from  Sarnia  consist 
of  365  observations.  The  model  that  describes  the  process 
from  which  this  series  is  observed  is  found  to  be  an 
ARIMA  (0,  1,  2)  of  the  form  (5.1) ,  where 

b-^  =  0.563  with  standard  deviation  of  0.043  , 
b2  =  0.326  with  standard  deviation  of  0.050, 

and  the  residual  variance  is  estimated  to  be  0.204.  The 
variance  estimate  of  the  transformed  series  is  0.329. 

Sudbury  NO 

X 

The  daily  maximum  NO^  series  from  Sudbury  consist 
of  observations  for  only  seven  months,  namely  January 
to  July  1971.  The  212  consecutive  observations  again  in¬ 
dicate  an  ARIMA  (0,  1,  2)  model  as  the  best  fitting  one, 
where 


b-^  -  0.566  with  standard  deviation  of  0.057. 
b2  =  0.414  with  standard  deviation  of  0.063, 

and  the  residual  variance  is  estimated  to  be  0.031.  The 
variance  estimate  of  the  transformed  series  is  0.360. 


. 


117 


Toronto  NO 

x 

The  1971  daily  maximum  NO  data  available  from 

u  x 

Toronto  contained  many  "holes"  which  were  filled  in 
according  to  the  method  proposed  in  section  2.1.  After 
'patching'  the  data  the  usual  stochastic  model  building 
procedure  is  employed  for  the  resulting  365  observations. 
An  ARIMA  (0,  1,  2)  model  is  found  to  fit  this  Toronto 
series,  where 

b^  =  0.600  with  standard  deviation  of  0.042, 

b^  -  0.280  with  standard  deviation  of  0.050, 

and  residual  variance  is  estimated  to  be  0.296.  The 
variance  estimate  of  the  transformed  series  is  0.392. 

Windsor  NO 

x 

The  daily  maximum  NO  data  in  Windsor  is  a  series  of 
consecutive  observations  for  8  months;  namely  January 
to  August,  1971.  An  ARIMA  (0,  1,  2)  model  is  obtained 
for  those  243  observations  also  where 

b1  =  0.447  with  standard  deviation  of  0.057, 

b2  =  0.323  with  standard  deviation  of  0.061, 

and  residual  variance  is  estimated  to  be  0.224.  The 
variance  estimate  of  the  transformed  series  is  0.342. 
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5 . 2  Result 

In  all  the  six  cities  considered  in  this  study  the 
stochastic  process  generating  daily  maximum  NO  is 
found  to  be  of  the  ARIMA  (0,  1,  2)  type,  that  is 


Z 


t 


't-l 


+ 


e 


t-l 


e 


t-l 


(5.2) 


The  parameter  estimates,  ,  b^  ,  for  the  different 
cities  differ  only  slightly.  This  does  not  imply  that 
the  NO  level  in  the  different  cities  are  the  same.  As  a 

X 

matter  of  fact  the  estimated  mean  of  the  transformed  series 
for  the  different  cities  vary.  If  the  models  were  fitted 
on  the  transformed  series  without  differencing,  the  dif¬ 
ferent  means  would  have  shown  up  different  constant 
terms.  However,  since  the  ARIMA  model  is  fitted  on  the 
first  difference  of  the  transformed  series  (for  stationarity 
reasons)  the  mean  of  the  first  difference  is  zero  as  ex¬ 
plained  in  Section  3.2,  thus  making  the  constant  term  in 
the  model  zero.  Table  5.1  summarizing  the  comparison  of 
the  six  models,  shows  that  Toronto  has  a  higher  NO  pol- 
lution  average  than  any  of  the  other  cities  and  Edmonton 
NO^  average  is  the  lowest,  although  its  NOx  variation  is 
more  than  three  times  as  large  as  that  of  the  other  cities. 
However,  it  must  be  remembered  that  the  series  for  the 
different  cities  do  not  belong  to  the  same  year  (although 
all  are  within  the  1970  -  72  period)  nor  do  they  have 
equal  length  as  shown  in  section  5.1.  Although  the  Sudbury 


■ 

. 
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TABLE  5.1  Comparison  of  the  stochastic  models  for  daily  maxima  of  NO 
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model  explains  only  16%  of  the  observed  variation  it  is 
the  best  fitting  model  for  this  series.  Since  the 
analysis  has  been  performed  on  the  logarithmic  trans¬ 
formation  of  the  series  as  indicated  in  Chapter  III,  the 
means  and  variances  in  Table  5.1  are  in  their  natural 
logarithms  and  their  antilogarithms  have  to  be  taken  to 
get  their  values  in  ppm  -  the  averages  in  ppm  are  in¬ 
dicated  in  parenthesis.  For  instance  the  average  NO 

X 

for  Edmonton  is  0.016  ppm.  Also  in  Table  5.1  population 
figures  of  the  cities  according  to  the  1971  census  as 
recorded  in  the  1972  issue  of  the  Canada  Yearbook  are 
included . 

The  fact  that  the  best  fitting  model  for  each  city 
turns  out  to  be  the  same  type  with  only  slight  differences 
in  their  parameters  is  extremely  surprising.  Especially, 
since  NO  is  produced  primarily  by  motor  vehicles  and 

X 

industry,  it  would  stand  to  reason  that  in  large  industrial 
urban  areas  the  behavior  of  daily  NO  may  be  different 

X 

from  its  behavior  in  less  densely  populated  urban  areas. 
However,  this  study  supplies  evidence  to  the  contrary. 

Since  the  ARIMA  (0,  1,  2)  models  are  so  similar,  one 
suspects  that  perhaps  a  common  ARIMA  model  exists  for 
all  urban  centers.  This  common  model  should  then  be  the 
ARMA  (0,2)  type  for  the  first  difference  of  the  trans¬ 
formed  data,  that  is  of  the  ARIMA  (0,  1,  2)  type.  To  try 
to  establish  such  a  common  ARIMA  (0,  1,  2)  model,  the 
first  difference  of  the  six  NC>  series  are  pooled  (that  is 


. 
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concatenated)  together  such  that  a  first  difference  series, 
2274  observations  long,  is  obtained.  For  the  pooling, 
variation  over  space  is  considered  rather  than  variation 
over  time.  Thus  the  (1  -  B)  7.  series  for  the  different 
cities  are  regarded  as  samples,  observed  in  the  different 
cities,  of  the  overall  process  (1  -  B)  z  .  The 
ARIMA ( 0 ,  1,  2)  model  fitted  to  this  overall  series  turns 
out  to  be 

(1  -  B)  Zt  =  (1  -  0 . 50  OB  -  0.270B2)  e  ,  (5.3) 

with  corresponding  residual  variance  of  0.347.  To  ensure 
that  this  combined  model  adequately  describes  the  NO^ 
behavior  in  the  individual  cities,  it  must  be  compared  to 
the  six  individual  models  in  Table  5.1.  The  test 
employed  here  is  similar  to  the  'constancy'  test  explained 
by  Huang  [27],  and  the  'coincidence'  test  used  for  re¬ 
gressions  and  described  by  Williams  [78].  This  test  for 
regression  can  be  applied  to  the  ARIMA(0,  1,  2)  models 
because  the  first  difference  of  the  transformed  z  's  are 
regressed  on  the  et's  which  are  white  noise  and  hence 
independent.  Thus  this  model  is  like  a  regression  model 
which  relates  a  dependent  variable  to  two  independent 
variables.  The  test  makes  use  of  the  analysis  of  variance 
as  follows. 

Let  there  be  m  samples  having  n2 . . .  n^ 


. 
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elements  respectively.  Let  n  =  +  .  +  nm  ' 

and  let  there  be  k  estimated  parameters  of  the  model.  If 
SS^  is  the  total  residual  sum  of  squares  due  to  the  over¬ 
all  model  obtained  by  pooling  all  the  samples,  and  SS2 
is  the  sum  of  the  sums  of  squares  within  each  sample,  then 
SS^  =  SS^  -  SS^  is  the  sum  of  sums  of  squares  between  each 
regression  plane  and  the  overall  regression  plane.  It  can 
be  shown  that 

rias  a  Chi  -  square  distribution  with  n  -  k 
degrees  of  freedom, 

has  a  Chi  -  square  distribution  with  n  -  km 
degrees  of  freedom, 

has  a  Chi  -  square  with  km  -  k  degrees  of 
freedom,  and  that 

SS~y(km  -  k) 

— - -  has  an  F  distribution  with  (5.4) 

SS2/(n  -  km) 

(km  -  k)  and  (n  -  km)  degrees 
of  freedom. 

Let  b1  ,  b2  ,  . .  bk  be  the  parameters  of  the 

overall  model,  and  b^  •  '  . '  ^lk  '  ^21  ' 


SS 


SS 


e 


SS 


■ 

' 
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b22  '  . '  b2k  ;  . ;  bml  '  bm2 

for  each  of  the  m  individual  models, 

the  hypothesis  to  be  tested  is  that 


;  be  the  parameters 
respectively,  then 


bll  b21  ~  b31 
b12  “  b22  =  b 32 


blk  b2k  b3k 

If  the  hypothesis  is  true  (i.e.  the  overall  regression 
hyperplane  and  the  individual  regression  hyperplanes  are 
coincident)  then  SS^  =  SS2  ,  and  SS^  is  zero.  The  test 
of  the  hypothesis  (5.5),  therefore,  can  be  considered  as 
a  test  of  significance  of  SS^  .  Thus  under  hypothesis 
(5.5),  the  observed  ratio  (5.4)  can  be  compared  to  the 
upper  5%  critical  point  of  the  F  distribution  with  the 
appropriate  degrees  of  freedom.  If  the  ratio  is  less  than 
the  upper  5%  critical  point,  then  SS^  is  not  significantly 
different  from  zero. 

Assuming  (5.5)  for  the  overall  and  the  six  individual 
ARIMA ( 0 ,  1,  2)  models  respectively,  the  observed  F  -  value 
(5.4)  is  0.609  for  10  and  2263  degrees  of  freedom.  However, 
the  upper  5%  critical  point  of  F  with  10  and  2263  degrees 
of  freedom  is  1.83,  hence  the  hypothesis  (5.5)  is  accepted. 


b  ,  =  b, 
ml  1 


bm2  b2 


(5.5) 


• 

rv 

. 


It  can  be  concluded,  therefore,  that  the  stochastic 
process  generating  daily  maximum  NO  in  these  urban 

X 

centers  can  be  described  by  the  ARIMA ( 0 ,  1,  2)  model, 

Zt  =  Zt_x  +  efc  -  0.500et_1  -  0.270et_2  .  (5.6) 

From  the  above  evidence  it  seems  reasonable  to  conclude 
further,  that  the  process  generating  daily  maximum  NO 

X 

in  any  urban  center  may  be  described  by  the  same 
ARIMA ( 0 ,  1,  2)  model. 

To  test  how  good  this  general  model  will  be  at  fore¬ 
casting,  it  is  used  to  obtain  lead  -  one  forecasts  of 
daily  maximum  NO  for  January  1970  in  Edmonton,  (where 

X 

January  1970  in  Edmonton  is  outside  the  period  used  to 
develop  the  model)  .  Table  5.2  and  Table  5.3  which  con¬ 
tain  the  forecasts  and  the  observed  data  respectively, 
indicate  that  the  forecasts  here  are  as  good  as  those 
obtained  for  the  corresponding  model  in  Chapter  III. 
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TABLE  5.2  Lead  -  one  Forecasts  of  the  Daily  Maxima 

of  NO  for  January  1970  in  Edmonton  * 

X 


0.031 

0.028 

0.031 

0.023 

0.022 

0.032 

0.050 

0.034 

0.059 

0.040 

0.028 

0.027 

0.025 

0.027 

0.034 

0.028 

0.063 

0.027 

0.036 

0.040 

0.069 

0.132 

0.124 

0.134 

0.191 

0.165 

0.129 

0.195 

0.161 

0.134 

0.112 

*  Read 

across  the 

page.  The 

unit  of  measurement  is 

ppm. 

TABLE  5. 

3  Observed  Daily 

Maxima  of 

NO^  for  January 

1970 

in  Edmonton  * 

0.025 

0.031 

0.018 

0.016 

0.039 

0.106 

0.043 

0.096 

0.047 

0.017 

0.016 

0.018 

0.023 

0.042 

0.029 

0.128 

0.026 

0.030 

0.047 

0.136 

0.496 

0.336 

0.242 

0.390 

0.254 

0.118 

0.246 

0.189 

0.109 

0.076 

0.047 

* 


Read  across  the  page.  The  unit  of  measurement  is  ppm. 
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Since  the  general  model  is  conjectured  to  be  adequate 
for  describing  the  daily  behavior  of  NO  in  any  urban  center 

X 

it  can  be  used  to  forecast  daily  maximum  NO^  concentration 
in  any  city.  Further  discussion  and  recommendations  for 
future  work  are  given  in  the  next  chapter. 


CHAPTER  VI 


CONCLUSIONS  AND  RECOMMENDATIONS 

6 . 1  Conclusions 

The  models  established  for  NO  in  this  study  generate 

some  surprising  information  concerning  the  behavior  of  the 

pollutant  in  urban  atmosphere.  One  such  fact  revealed 

by  the  stochastic  models  is  that  the  concentration  of  NO  in 

urban  centers  follows  no  particular  pattern.  For  example, 

one  would  expect  a  7  -  day  cycle  in  the  daily  behavior,  such 

that  the  minimum  concentration  occurs  during  the  week-end 

when  there  is  less  traffic  flow  and  industrial  activities. 

The  fact  that  all  of  the  particular  stochastic  models 

fitting  the  data  from  the  individual  cities,  and  the  general 

stochastic  model  (5.6)  for  urban  centers,  contain  no  seasonal 

term  to  reflect  a  7  -  day  cycle  shows  that  no  such  cycle 

exists .  Other  information  produced  by  stochastic  models  is 

that  NO  behaves  in  the  same  way  in  all  urban  centers, 
x 

As  a  result  it  was  possible  to  establish  a  general 
stochastic  model  (5.6)  to  explain  the  behavior  of  NO  in 
urban  atmosphere.  The  general  stochastic  model  established 
is  ARIMA (0,1,2)  based  on  the  first  difference  of  the 
transformed  data;  hence,  the  difference  in  the  level 
of  concentration  among  the  cities  does  not  affect  the 
parameters  of  the  model. 

The  dynamic  system  models  (combined  transfer  function- 


■ 

. 


128 


noise  models)  show  that  among  the  three  factors  which  were 
conjectured  to  be  capable  of  influencing  the  behavior  of  NO 

X 

in  urban  atmosphere,  only  temperature  had  such  influence. 

The  three  input  factors  considered  were  temperature,  wind 
speed,  and  traffic  flow.  Traffic  flow  was  considered  because 
it  is  known  that  automobile  exhaust  is  a  major  source  of  NO 
in  a  city.  Wind  speed  was  considered  because  atmospheric 
mixing  and  transport  due  to  wind  could  have  an  effect  on  the 
concentration  of  air  pollutants.  Temperature  was  considered 
since  temperature  affects  the  amount  of  energy  to  be  generated 
for  heating,  and  hence,  the  amount  of  NO  emitted  by  power 

X 

plants.  The  transfer  function  connecting  temperature  to 
hourly  behavior  of  NO^  in  the  Edmonton  atmosphere  shows  that 
an  incremental  change  in  temperature  has  an  effect  on  NO 

A 

concentration  after  two  hours.  For  the  daily  behavior,  an 
incremental  change  in  the  daily  temperature  is  reflected  in 
the  NO  maximum  concentration  that  same  day.  Temperature 

X 

observations  in  the  Industrial  Airport  are  known  to  be  fairly 
representative  of  temperature  in  the  city  (See  Hage  and  Longley 
[21]).  However,  the  wind  speed  observed  in  the  same  airport 
and  the  traffic  flow  observed  on  Jasper  avenue  cannot  represent 
the  situation  in  the  whole  city,  and,  thus,  cannot  represent 
the  situation  in  Administration  Building,  where  NO^  is  being 
monitored.  The  remoteness  of  the  observation  stations  of  wind 
speed  and  traffic  flow  from  where  NO.^  is  being  monitored  may 
account  for  the  apparent  lack  of  relationship  between  them  and 
NO  .  It  is  especially  surprising  that  no  relationship  is  found 
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to  exist  between  NO  and  traffic  flow;  as  a  result  the 

x 

dynamic  system  models  do  not  involve  traffic  flow. 

Comparing  the  model  types  developed  in  this  study 
with  diffusion  models  it  can  be  seen  that  stochastic  and 
dynamic  system  models  have  some  advantages  over  diffusion 
models.  The  models  give  forecasts  which  are  close  to  the 
observed  values,  and  unlike  the  reported  characteristic  of 
diffusion  models,  the  models  developed  here  produce  air 
pollutant  concentration  estimates  which  are  neither  con¬ 
sistently  smaller  nor  consistently  greater  than  the 
observations.  In  addition  the  confidence  intervals  of 
the  forecasts  generated  by  the  stochastic  and  the  dynamic 
system  models  could  be  estimated  and  used  as  a  measure  of 
the  reliability  of  the  forecasts.  Thus  it  was  found  that 
the  stochastic  model  for  daily  maxima  of  NO  would  produce 

X 

reliable  forecasts  for  one  and  two  days  ahead.  The  dynamic 
system  model  (4.3.2)  for  hourly  NO  would  produce  reliable 
forecasts  for  one,  two, and  three  days  ahead,  while  the  dynamic 
system  model  (4.3.4)  for  daily  maxima  of  NO  would  produce 

X 

reliable  forecasts  for  one  and  two  days  ahead.  Although 
longer  lead  forecasts  by  these  models  are  tolerable  as 
some  idea  of  future  occurrence,  their  95%  confidence  inter¬ 
vals  are  too  wide;  as  a  result  the  fact  that  all  the 
observations  fall  within  the  95%  confidence  intervals  does 
not,  for  practical  purposes,  prove  the  reliability  of  longer 
lead  forecasts.  The  only  obvious  advantage  of  the  diffusion 
model  over  the  model  types  proposed  in  this  study  is  that  the 
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diffusion  models  can  supply  information  levels  of  a  pollu¬ 
tant  where  no  monitoring  station  exists,  since  the  develop¬ 
ment  of  the  diffusion  model  does  not  require  the  observed 
data  of  the  pollutant  at  the  receptor.  However, the  develop¬ 
ment  of  diffusion  models  require  other  data,  like  source 
inventories  and  meteorological  observations.  Notwithstanding, 
if  necessary  observations  are  available  for  the  development 
of  stochastic  and  dynamic  system  models,  they  are  better 
as  predictive  models  for  air  pollutants  than  diffusion  models. 

Comparing  stochastic  models  with  dynamic  system  models 
one  would  find  that  the  dynamic  system  models  are  prefer¬ 
able.  Dynamic  system  models  can  give  information  about 
the  relationship  connecting  the  pollutant  with  the  influenc¬ 
ing  factor  as  illustrated  in  this  study  in  the  case  of  NO^ 
and  temperature.  Also  the  lack  of  relationship  between  the 

conjectured  input  factor  and  the  output  is  detectable  at  the 
model  development  stage  as  exemplified  in  the  case  of  NO.^ 
and  the  two  input  factors,  traffic  flow,  wind  speed.  In 
addition,  dynamic  system  models  generate  better  forecasts 
in  the  sense  that  the  estimates  they  produce  are  closer  to 
the  observed  values  than  those  produced  by  stochastic  models. 
Generally  dynamic  system  models  can  be  used  as  a  basis  for 
the  control  of  the  system.  However,  the  possible  dynamic 
system  models  established  for  study 

cannot  be  used  for  control  because  ambient  temperature, 
which  is  the  only  input  factor  cannot  be  controlled.  If 
traffic  flow  had  an  obvious  connection  with  NO  through  a 


. 
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transfer  function,  then  a  dynamic  control  function  required 
to  control  the  pollutant's  level  could  have  been  obtained 
from  the  resultant  combined  transfer  function-noise  model. 

The  control  aspect,  however,  lies  outside  the  scope  of  this 
thesis;  further  details  about  control  functions  are  given 
in  Box  and  Jenkins  [6], 

In  general,  the  city  of  Edmonton  still  enjoys  a  clean 
atmosphere.  The  NO^  level  is  usually  below  the  safe  standard 
set  in  many  places.  The  Alberta  clean  air  act  passed  in 
January  1973  stipulated  an  annual  mean  of  0.03  ppm  while 
the  mean  observed  for  daily  maxima  of  NO  over  a  year  in 
this  study  is  0.016  ppm.  Although  some  places  close  to 
power  stations,  refineries,  and  highways  may  experience 
higher  concentrations,  the  concentration  level  for  the  city 
as  measured  at  the  Administration  Building  is  generally 
below  the  safe  level. 

6 . 2  Recommendations 

Precision  and  continuity  of  the  measurements  of  con¬ 
centration  of  air  pollutants  cannot  be  over-emphasized  in 
any  pollution  surveillance  program.  Therefore,  it  is  hereby 
recommended  that  monitoring  stations  obtain  more  reliable 
measuring  instruments  or  that  two  instruments  are  alloted 
to  each  station,  so  that  one  may  be  used  as  a  back-up  device. 
This  would  prevent  loss  of  observations  which  at  present 
is  common  in  monitoring  NO  in  all  the  cities  investigated 

A 


' 

. 


in  this  work.  Wherever  possible  these  instruments  may 
be  interfaced  with  computers  for  accurate  data  recording, 
analysis  and  immediate  reporting. 
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For  the  City  of  Edmonton  more  monitoring  stations  are 
necessary  for  NO  .  A  network  of  stations  would  give  a 
better  overall  picture  of  the  pollutants  level  in  the  city 
than  only  one  station.  In  addition,  input  factors  that  • 
could  affect  NO  concentration  should  be  monitored  at  or 

X 

very  near  the  NO  monitoring  stations. 

X 

To  use  the  different  forecast  functions  developed  in 
Chapters  III,  IV,  and  V  for  the  purpose  of  forecasting  NO  , 
the  minimum  number  of  past  records  required  is  the  highest 
degree  of  backward  shift  operator  B  in  the  function  (or  in 
the  model  which  leads  to  the  forecast  function) ;  for  example 
model  (5.3)  requires  a  minimum  of  2  past  records.  However, 

50  past  records  are  recommended  wherever  possible.  If  the 
recursive  estimation  of  values  is  started  50  points  back, 
loss  of  information  that  may  be  caused  by  equating  the 

initial  error  e^,  to  zero, will  be  compensated  by  the  time  the 

required  future  estimate  is  calculated.  It  should  be 

emphasized  that  the  models  here  fit  certain  transformations 

of  the  data  (see  transformations  in  Table  4.1),  hence  the 

records  used  should  first  be  appropriately  transformed 

[In  (z  +  0.001)  in  the  case  of  NO  ]  before  applying  the 
t  A 

forecast  function.  The  forecast  obtained  then  is  given  as 
a  logarithm  and  it  should  be  transformed  back  to  ppm 

by  taking  its  antilogarithm  and  subtracting  0.001. 
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It  may  be  desirable  to  forecast  the  mean  concentration 
of  a  pollutant  over  a  region  including  many  cities  (  or  a 
city  including  many  monitoring  stations) .  If  reliable 
records  of  the  pollutant  exist  for  stations  all  over  the 
region  they  can  be  used  to  calculate  a  series  of  means  for 
the  whole  region.  The  series  of  regional  means  can  be  re¬ 
garded  as  output  while  the  set  of  local  means  will  be  in¬ 
puts  to  the  system.  Then  a  combined  transfer  function  - 
noise  model  connecting  the  inputs  to  the  output  can  be 
developed,  and  this  will  be  capable  of  generating  forecasts 
of  regional  means.  This  cannot  be  developed  for  Edmonton 
at  present  because  there  is  only  one  monitoring  station. 

Also  it  is  not  tried  for  cities  considered  here  because 
the  series  available  are  not  of  equal  length  and  they  do 
not  cover  the  same  period  of  time. 

The  time  series  analysis  approach  to  pollution  model¬ 
ling  is  described  generally  in  this  study  such  that 
people  interested  in  the  field  of  air  pollution  could  be 
aware  of  its  potentialities.  Although  models  in  this  study 
were  developed  for  NC>x,  it  is  obvious  from  the  discussion 
that  the  same  procedures,  outlined  in  this  study,  could  be 
followed  to  establish  models  that  would  be  able  to  explain 
the  behavior  of  any  other  measurable  air  pollutant  like 
oxides  of  sulfur,  oxides  of  carbon  and  particulate  matter. 

In  addition f further  investigation  employing  dynamic  system 
model  approach  could  provide  a  powerful  tool  for  air 
pollution  control  in  urban  centers . 
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Appendix  A 


Data  Organization  on  the  Magnetic  Tape 

On  magnetic  tape  all  the  data  for  a  month  are 
stored  in  a  file,  thus  there  are  93  files  on  the  tape. 

The  first  file  contains  data  for  July  1964,  the  second 
for  August,  1964  and  so  on  till  the  93rd  file  which 
contains  data  for  March  1972.  As  more  monthly  data  are 
acquired,  files  can  be  created  for  them  starting  from 
file  94  for  April  1972. 

The  files  have  variable  size  depending  on  the  number 
of  records  each  contains.  A  record  is  80  bytes  which 
is  the  size  of  an  80  column  card.  The  records  are 
blocked  and  the  block  size  of  7200  bytes  is  fixed.  Since 
some  of  the  data  are  alphanumeric  (i.e.  wind  speed  and 
direction)  while  the  others  are  numeric  the  data  are 
stored  on  tape  in  A  -  format  so  as  to  make  them  re¬ 
trievable  in  any  suitable  format. 

The  first  62,  60,  58,  or  56  records  (depending  on 
the  number  of  days  in  the  month)  in  a  file  contains  the 
daily  wind  records  for  two  stations  which  are  the  Edmonton 
Industrial  Airport  and  Edmonton  ^International  Airport  . 

Each  station  has  12  two-hourly  readings  of  wind  speed 
and  direction  on  its  record  per  day.  For  all  the  records 
the  first  two  columns  contain  identification  code  for 
the  record.  The  wind  record  has  '21'  as  its  identification 
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code.  The  next  six  columns  give  year,  month,  and  day, 
which  are  then  followed  by  columns  containing  wind 
direction  and  speed. 

The  pollutants  1  records  follow  the  wind  data  in  the 
file.  Like  wind  record  the  first  2  columns  contain  code 
that  identifies  the  pollutant.  The  next  6  columns  give 
year,  month,  and  day.  The  list  that  follows  show  the 
code  dictionary  for  the  pollutants. 


Code  Pollutant 


31 

Soiling 

Index 

(Smoke) 

for 

Station 

1 

32 

Soiling 

Index 

(Smoke) 

for 

Station 

2 

33 

Soiling 

Index 

(Smoke) 

for 

Station 

3 

34 

Soiling 

Index 

(Smoke) 

for 

Station 

4 

35 

Soiling 

Index 

(Smoke) 

for 

Station 

5 

36 

Soiling 

Index 

(Smoke) 

for 

Station 

6 

41  NO  hourly  average 

X 

51  NO  five  minute  peak 

x 

42  N02  hourly  average 

52  N02  five  minute  peak 

61  Oxidant  hourly  average 

71  Hydrocarbon  hourly  average 

81  Carbon  Monoxide  hourly  average 

Data  for  soiling  index  of  smoke  or  coefficient  of 
haze  (COH)  are  two  hourly  measurements,  so  there  are  12 


data  points  per  station  per  day.  For  other  pollutants 
there  are  24  data  points  per  day. 


In  each  record,  the  actual  data  starts  from  the  9th 
column.  Each  data  point  for  the  pollutants  is  allocated 
3  columns .  Interpretation  of  the  3  -  column  data  point 
for  each  pollutant  is  the  following. 


Pollutant 

Value  of  Retrieved  Measurement 

COH 

X . XX  units 

NO 

X 

.XXX  ppm 

N02 

. XXX  ppm 

Oxidant 

XX . X  pphm 

Hydrocarbon 

XX . X  ppm 

Carbon  Monoxide 

XX . X  ppm 

If  999  is  encountered  in 

the  data  field,  it  indicates 

that  there  is  no  measurement  for  that  pollutant  at  that 
particular  time. 


* 

Appendix  B 


Initial  Estimate  Algorithm 

(1)  Perform  the  necessary  transformation  and  differencing 
to  obtain  w^  from  zt  .  Thus,  the  process  to  be 
investigated  is  an  ARMA (p ,  q) . 

(2)  Estimate  the  autocovariance  function,  C(k),  of  the 
series  wfc  . 

(3)  Estimate  the  autoregressive  parameters  a^  ,  a2  •  . 

a^  from  the  autocovariances  by  solving  the  following 
set  of  equations 

a-LC(q)  +  a2C(q  -  1)  +  .  +  apC(q  -  P  +  U  =  C(q+1) 

a1C(q  +  1)  +  a2C(q)  +  .  +  apC(q  “  P  +  2)  =  C(q+2) 

•  •  • 

•  •  • 

•  <*  • 

apC (q+p-1) -+la2C (q+p~2)  +  ......  +  apC(q)  -'C(qtp) 

A  A  A 

(4)  Use  the  estimates  a-^  ,  a2  ,  . .  a^  obtained  in 

(3)  to  obtain  modified  covariance  sequence  C'(j)  for 
j  =  0,  1,  . .  q  as  follows: 
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i  p  p  ^  ^ 

c(j)=  Z  Z  a.  a  C(|  j  +  i  -  k|) 

i=0  k=0  1  K 

for  p  >  0,  (aQ  =  -1) 

C  ( j )  =  C ( j )  for  p  =  0 . 

!  I  I 

(5)  Use  autocovariances  C  (0),  C  (1),  .......  C  (q)  to 

iteratively  compute  estimates  of  the  moving  average 

parameters  ,  b 2  ,  . ,  b^  applying  the 

Newton  -  Raphson  algorithm  developed  by  Wilson  [78] 
and  summarized  below 

(a)  Obtain  initial  values  of  X's  given  by 

xQ  =  /cToT 

Xj  =  0,  j  =  1,  . .  q 

(b)  Start  iteration 

(i)  Obtain  F ^  given  by 

r)  ■  %  Vi*)  ■c'l)l 

j  =0,1,  . .  q. 

(ii)  Obtain  Matrix  T  which  is  given  by 


■ 
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"■  — 

xn  X,  ..  ..  X 

0  1  q 

Xq  .  .  .  .  Xg 

o 

t1 

• 

• 

CM 

X 

t — 1 

X 

0  X0  xi  ••  xq-l 

X0  X^  X  0  0 

2  3  q 

•  •  «  * 

+ 

•  • 

•  O 

•  •  •  • 

•  •  •  9 

«  » 

«  • 

« 

X  0  0  0 

< 

X 

o 

o 

L  9 

U 

mm 

(iii)  Obtain  ,  j  =  0,  1,  . ,  q  where  is  the 

correction  factor  for  by  solving  the  set  of 
linear  equations  given  by 


T  H  =  F 

/ 


where  T  is  the  q  +  1  by  q  +  1  matrix,  H  and  F 
are  vectors  having  q  +  1  elements  each. 

(iv)  Obtain  new  values  of  Xj  given  by 

Xj  =  X j  ~  H j  ;  for  j  =  0 ,  . ,  q 

(v)  If  |F.,|  <  epsilon  orjHu  |  <  epsilon  for 

j  =  o,  1,  . ,  q  where  epsilon  is  a  small 

number,  say,  0.001  then  go  to  (c)  to  obtain  the 

estimates  of  b-^  ,  b ^  ,  . .  b^  ,  otherwise 

go  back  to  (b) . 


A 


(c)  Obtain  parameter  estimates  b.  given  by 

3 


b  . 
3 


Vxo 


-  q  . 


Obtain  also  residual  variance  estimate, 
given  by 


a 

e 


2 


(6)  Output  results. 


Appendix  C 


Model  Estimation  Algorithm 

(1)  Input  the  number  p,  and  values  of  the  initial 
estimates  of  autoregressive  parameters  if  any. 

Input  the  number  q,  and  values  of  the  initial  estimates 
of  moving  average  parameters  if  any. 

(2)  Perform  the  necessary  transformations  and  dif¬ 
ferencing  on  the  raw  data  to  obtain  series  {w^}. 

(3)  Estimate  series  {et}  of  the  residuals  using  the 
possible  model  and  the  current  estimates  of  the  para¬ 
meters  . 

(4)  Estimate  sum  of  squares,  SS ,  of  residuals,  given  by 

n  2 

SS  =  E  e.  where  n  is  the  length  of  w  series. 
i=l  1 

(5)  Repeat  steps  (3)  and  (4)  for  all  possible  values  in 
the  admissible  region  of  the  parameters.  This 
produces  a  grid  of  residual  sum  of  squares. 

(6)  Pick  the  set  of  parameters  which  give  minimum  sum 
of  squares,  SS ,  in  the  grid. 
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2 


of  the  residual  variance, 


Obtain  the  estimate 
2 

,  given  by 


S 

e 


Se2  =  SS/(n  -  1  -  q  -  2p)  . 


(7)  Output  the  efficient  estimates  of  the  model's 
parameters  and  the  estimate  of  the  residual 


variance . 
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Appendix  D 


Transfer  Function  Model  Identification  Algorithm 

(1)  Plot  the  output  and  input  series  together  to  detect 
visually,  if  possible,  the  relationship  between  them. 

(2)  Fit  stochastic  model  of  the  type  discussed  in 
Chapter  3  to  the  input  series. 

(3)  Prewhiten  the  input  and  transform  the  output 
according  to  the  model  fitted  to  the  input.  This  will 
give  a  series  of  residuals  y  ,  of  the  input  which  are 
not  autocorrelated ,  and  a  series  of  residuals  w^  , 

of  the  output  which  may  or  may  not  be  autocorrelated. 

(4)  Calculate  the  crosscorrelation  function  r(k), 

k  =  1,  2,  . .  K  of  yt  and  w^  where  K  =  14. 

(5)  Use  r(k)  to  obtain  the  impulse  response  function 
v^  given  by 

v,  =  r(k)  s  /s 
k  wy 

(6)  Use  r(k)  to  determine  d,p  and  s  as  explained  in 
section  4.1. 
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(7)  Use  v,  to  estimate  parameters  a.  ,  i  =  1,  . .  p, 

K  1 

and  b .  ,  i  =  0 ,  . ,  s . 

(8)  Carry  out  steps  (1)  to  (7)  for  each  leading  indicator. 


(9)  Output  results. 


Appendix  E 

Marquardt  Algorithm  for  Estimating  Parameters 

(1)  Input  the  number  m,  of  parameters  to  be  estimated 

/N 

and  the  initial  values  p^  ,  i  =  1,  . ,  m,  of 

the  parameters.  Estimate  the  residual  sum  of  squares 
due  to  the  initial  value  of  the  parameters. 

(2)  Initialize  iteration  counter  I  to  zero.  Set  the 

maximum  number  IUP ,  of  iterations  that  will  be 
allowed,  say  IUP  =  10.  Specify  R  =  0.01,  say,  and 
F  =  10  where  R  and  F  constrain  the  search.  Specify 
also  an  epsilon  EP  =  0.001,  as  the  convergence  para¬ 
meter.  Specify  a  delta  del  =  0.01  that  will 

be  used  to  perturb  the  parameters  one  at  a  time 
when  the  derivatives  of  et  are  being  calculated. 

First  Stage  of  Iteration 

(3)  1  =  1  +  1 

If  I  >  IUP  go  to  (17)  . 

(4)  Obtain  the  residual  series,  efc  ,  from  the  model 
using  the  current  value  of  the  parameters . 

(5)  Calculate  the  partial  derivative  of  each  et  in  the 
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series  with  respect  to  different  parameters  as 
follows : 


(a)  Derivative  Xi  of  et(p1  ,  p2  ,  . ,  Pm) 

with  respect  to  parameter  p^  is  given  by 


{et(Pl  '  P2  . 

“  o^_(p^  ,  p2  /  ......  f  p.^  +  del ,  .  •  •  t 

P±  +  del,  . . . ,  pm) }/ del . 


(b)  Calculate  X.  as  in  (a)  for  all  e  , 

1  /  u  "C. 

t  =  1,  . .  n  where  n  is  the  size  of  the 

series . 

(c)  Carry  out  steps  (a)  and  (b)  for  all  p^, 

i  =  l,  . ,m.  Thus  for  m  parameters  there 

will  be  m  vectors  each  containing  n  X.  ,  and 

1 ,  r. 

all  the  m  vectors  constitute  an  n  bym  matrix. 


(6)  Form  an  m  by  *m  matrix 


A 


lAij }  - 


where 


A.  . 
ID 


n 

Z 

t=l 


X.  .X.  , 

1ft  J,t 


(7)  Form  vector  G  having  m  elements  where 


. 


■ 
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n 

G.  =  EX.  e . 
i  t=1  ift  t 


(8)  Obtain  a  vector  D,  of  scaling  quantities  having  m 
elements  where 


D  .  =  /(A •  .  j 

l  li 


Second  Stage  of  Iteration 


(9)  Obtain  the  modified  linearized  equations  as  follows 


A* . .  =  A. ./D. D .  ,  i  4=  j 

lj  ij  l  j  1 

A* . .  =  1  +  R 
n  ' 

G*.  =  G./D. 

l  li* 


(10)  Solve  the  following  set  of  linear  equations  for 


H* 


A*H*  =  G*  . 


(11)  Scale  H*  back  to  H  to  give  the  parameter  corrections 
thus 


H.  =  H*./D.  . 

l  l  i 


(12)  Obtain  the  new  parameter  values 

P^  =  +  IF  ,  i  =  1,  . ,  m. 

(13)  Estimate  sum  of  squares,  SS,  of  the  residuals  due  to 
the  new  parameters .  Compare  this  with  the  old  sum 
of  squares,  SO,  due  to  the  old  parameters. 

(14)  If  SS  <  SO  go  to  (16) . 

(15)  If  SS  >  SO  multiply  R  by  F.  Test  if  R  is  greater 
than  its  upper  bound,  UP,  say  UP  =  2 . 

If  R  <  UP,  go  to  9;  otherwise  the  search  has  failed, 
then  go  to  17 . 

(16)  Test  all  IF  ,  i  =  1,  . .  m. 

(a)  If  every  IF  is  less  than  EP ,  the  optimal  esti¬ 
mates  of  the  parameters  have  been  obtained. 
Obtain  also  the  covariance  matrix  of  the  para¬ 
meter  estimates  which  is  the  inverse  of  the  m 
by  m  matrix  A. 

Variance  estimate  of  parameter  p^  is  A  . 

(b)  If  the  condition  in  (a)  is  not  satisfied,  set 
old  values  of  the  parameters  p^  to  their  new 
values.  Divide  R  by  F.  Go  back  to  (3) . 

(17)  Output  results. 


.m  . .  ,  i  -  x  '  i 11  iq  ='  iq 
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SERIES  A  NO  HOURLY  AVERAGES  FROM  FEBRUARY  22  TO  MARCH  7, 

X 

1967* 


0.004 

0.001 

0.002 

0.001 
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*The  first  record  is  the  observation  for  midnight  to  1  a.m. 
on  February  22  and  consecutive  observations  are  listed  row 
wise.  Four  rows  constitute  observations  for  one  day.  The 
observations  are  in  ppm. 
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0.083  0.124  0.052  0.054  0.100  0.281  0.151 

0.113  0.132 


*The  first  record  is  the  observation  for  April  1,  1971  and 
consecutive  observations  are  listed  row  wise.  Each  row 
constitures  observations  for  a  week.  The  observations  are 
in  ppm. 
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*Re ad  across  the  page.  Measurements  are  in  degrees 
Fahrenheit. 
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HOURLY  WIND  SPEED* 
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